The brain recognizes familiar individuals by combining cues from several sensory modalities, including a person's face and voice. Here we used functional magnetic resonance imaging (fMRI) and a whole-brain, searchlight multi-voxel pattern analysis (MVPA) to search for areas in which local fMRI patterns support identity classification as a function of sensory modality. We found several areas supporting face or voice stimulus classification based on fMRI responses, consistent with previous reports; the classification maps overlapped across modalities in a single area of the right posterior superior temporal sulcus (pSTS). Remarkably, we also found several cortical areas, mostly located along the middle temporal gyrus, in which local fMRI patterns supported identity "cross-classification": vocal identity could be classified based on fMRI responses to faces, or the reverse, or both. These findings suggest a series of cortical identity representations increasingly abstracted from the input modality.

• Local patterns of cerebral activity measured with fMRI can classify familiar faces or voices.
• Overlap of face- and voice-classifying areas in right posterior STS.
• Cross-classification of facial and vocal identity in several temporal lobe areas.

The ability to recognize familiar individuals is central to our social interactions. The human brain achieves this by making use of cues from several sensory modalities, including visual signals from a person's face and auditory signals from her voice 1,2. There is evidence that these cues are combined across senses to yield more accurate, more robust representations of person identity, a clear case of multisensory integration 3-5. For instance, familiar speaker recognition is faster and more accurate when the voice is paired with a time-synchronized face from the same individual than when the voice is presented alone, and slower and less accurate when it is paired with the face of a different individual 3. The contribution of different sensory modalities to person perception is acknowledged in particular by cognitive models such as Bruce and Young's (1986) model of face perception. Specifically, they proposed the notion of "person identity nodes" (PINs): a portion of associative memory holding identity-specific semantic codes that can be accessed via the face, the voice, or other modalities; this is the point at which person recognition, as opposed to face recognition, is achieved 6,7. Whether the PINs have a neuronal counterpart in the human brain remains unclear, in part because most studies of person recognition, whether using neuropsychological assessment of patients with brain lesions or neuroimaging techniques such as functional magnetic resonance imaging (fMRI) in healthy volunteers, have focused on a single modality, most often the face and, to a much lesser extent, the voice; only a few studies have investigated the cerebral bases of person recognition based on more than one sensory modality 1,4.
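To make the cross-classification logic concrete, the minimal sketch below trains a linear classifier on single-trial response patterns from one modality and tests it on the other. This is an illustrative sketch only: the identity count, trial numbers, voxel count, simulated data, and classifier choice are assumptions for demonstration, not the study's actual searchlight pipeline.

```python
# Illustrative sketch of cross-modal identity classification (not the study's pipeline).
# Data are simulated; shapes, labels, and the LinearSVC classifier are assumptions.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

n_identities = 4        # hypothetical number of familiar identities
n_trials_per_id = 20    # hypothetical trials per identity and modality
n_voxels = 100          # hypothetical voxel count in one searchlight sphere

# Simulated single-trial response patterns (trials x voxels) for each modality.
X_face = rng.standard_normal((n_identities * n_trials_per_id, n_voxels))
X_voice = rng.standard_normal((n_identities * n_trials_per_id, n_voxels))
y = np.repeat(np.arange(n_identities), n_trials_per_id)  # identity labels

clf = make_pipeline(StandardScaler(), LinearSVC())

# Cross-classification: train on one modality, test on the other.
clf.fit(X_face, y)
face_to_voice_acc = clf.score(X_voice, y)

clf.fit(X_voice, y)
voice_to_face_acc = clf.score(X_face, y)

print(f"face -> voice accuracy: {face_to_voice_acc:.2f}")
print(f"voice -> face accuracy: {voice_to_face_acc:.2f}")
```

In a searchlight analysis of real data, this train-on-one-modality, test-on-the-other scheme would be repeated within each local spherical neighborhood of voxels; accuracy reliably above chance (here 1/4) in a given sphere would suggest a modality-general identity code at that location.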