The ability to process verbal language seems unique to humans and relies not only on semantics but also on other forms of communication, such as affective vocalizations, that we share with other primate species, particularly great apes (Hominidae). To better understand these processes at the behavioural and brain levels, we asked human participants to categorize vocalizations of four primate species, namely human, great apes (chimpanzee and bonobo), and monkey (rhesus macaque), during MRI acquisition. Classification was above chance level for all species except bonobo. Imaging analyses used a participant-specific, trial-by-trial fitted probability of categorization in a model-based approach to data analysis. These model-based analyses revealed the involvement of the bilateral orbitofrontal cortex and inferior frontal gyrus pars triangularis (IFGtri), whose activity correlated positively and negatively, respectively, with the fitted probability of accurate species classification. Further conjunction analyses revealed enhanced activity in a sub-area of the left IFGtri specifically for the accurate classification of chimpanzee calls compared to human voices. Our data therefore reveal distinct frontal mechanisms that shed light on how the human brain evolved to process non-verbal language.