Faces and voices are of high importance in interpersonal communication, and there are notable parallels between face and voice perception. However, these parallels do not sit entirely comfortably with the full range of available evidence. This review evaluates parallels between the functional and neural organisation of face and voice perception, whilst locating these in the context of ways in which faces and voices also differ. It takes the discussion to the next level by asking why these commonalities and differences exist. A novel synthesis is offered, grounded in the interaction between intrinsic characteristics of faces and voices and the demands of everyday life, showing how the pattern of findings reflects a system that can respond optimally to different everyday demands.
Highlights
• Similarities in functional organisation have led to the proposal of parallel, largely independent processing streams for voices and faces. Linked to this conception is the idea that the voice can be considered to be a kind of 'auditory face'.
• However, neuroimaging studies show a strong contribution of multimodal regions that respond both to voices and to faces. Closer examination of neuropsychological and behavioural studies supports this form of organisation.
• The contribution of differences in how relatively invariant information (such as a person's identity) and more rapidly changing information (such as their emotional state) must be represented needs to be carefully considered.
• Understanding the everyday demands of different tasks involving voice and face perception offers a resolution in which these serve as strong drivers of the optimal functional and neural organisation.

Young, Frühholz and Schweinberger

Understanding face and voice perception

Human communication involves complex patterns of signals originating primarily from the face, voice, and body [1]. Whilst much of this communication takes the form of propositional speech, faces and voices can also convey common forms of information concerning a person's gender, age, identity, health and emotional state, and they create impressions of warmth, competence and other social traits [2,3]. Much modern research has therefore focussed on communication from the face and voice [4-9]. This review aims to strengthen theoretical approaches to key properties of face and voice perception.
It offers a synthesis of existing evidence based on evaluating functional perspectives (see Glossary) and neural perspectives in light of the overarching background of what can be communicated through faces and voices, the different contingencies this creates, the demands of everyday life, and the ways in which these act as determinants of a communicative system that has to balance the needs of the sender and recipient.