Successful information processing requires focusing attention on a certain stimulus property while simultaneously suppressing irrelevant information. The Stroop task is a useful paradigm for studying such attentional top-down control in the presence of interference. Here, we investigated the neural correlates of an auditory Stroop task using fMRI. Subjects focused either on tone pitch (relatively high or low; phonetic task) or on the meaning of a spoken word (high/low/good; semantic task), while ignoring the other stimulus feature. We differentiated between task-related interference (phonetic incongruent vs. semantic incongruent) and sensory-level interference (phonetic incongruent vs. phonetic congruent). Task-related interference activated regions similar to those reported for visual Stroop tasks, including the anterior cingulate cortex (ACC) and the pre-supplementary motor area (pre-SMA). More specifically, we observed that the very caudal/posterior part of the ACC was activated and not the dorsal/anterior region. Because this contrast compares identical stimuli under different task demands, it reflects conflict at a relatively high processing level. A more conventional contrast between incongruent and congruent phonetic trials was associated with a different cluster in the pre-SMA/ACC, one that has been observed in a large number of previous studies. Finally, functional connectivity analysis revealed that activity within the regions activated in the phonetic incongruent vs. semantic incongruent contrast was more strongly interrelated during semantically vs. phonetically incongruent trials. Taken together, besides activation of regions well known from visual Stroop tasks, we found activation of the very caudal and posterior part of the ACC due to task-related interference in an auditory Stroop task. Hum Brain Mapp 00:000-000, 2009. © 2009 Wiley-Liss, Inc.