Perceiving and correctly interpreting emotional expressions is one of the most important abilities for communication in social animals. It shapes the majority of social interactions, group dynamics, and cooperation, and is therefore highly relevant for an individual's survival. Core mechanisms of this ability have been hypothesized to be shared across phylogenetically closely related species. Here, we explored homologies in human processing of different species' facial expressions using eye-tracking. Introducing a prime-target paradigm, we tested how priming with differently valenced emotional stimuli depicting human and chimpanzee faces influences human attention. We demonstrated an attention shift towards the conspecific (human) target picture that was congruent with the valence depicted in the prime picture. We did not find this effect with heterospecific (chimpanzee) primes and ruled out that this was due to participants interpreting them incorrectly. Implications for the involvement of related emotion-processing mechanisms for human and chimpanzee facial expressions are discussed. Systematic cross-species investigations of emotional expressions are needed to unravel how emotion representation mechanisms can extend to processing other species' faces. Such studies can help close the gap in our knowledge about the shared evolutionary ancestry of humans and other animals and, ultimately, answer the question "Where do emotions come from?"