Facial expressions are a core component of the emotional response of social mammals. In contrast to Darwin's original proposition, expressive facial cues of emotion appear to have evolved to be species-specific. Faces trigger an automatic perceptual process, and so, inter-specific emotion perception is potentially a challenge; since observers should not try to "read" heterospecific facial expressions in the same way that they do conspecific ones. Using dynamic spontaneous facial expression stimuli, we report the first inter-species eye-tracking study on fully unrestrained participants and without pre-experiment training to maintain attention to stimuli, to compare how two different species living in the same ecological niche, humans and dogs, perceive each other's facial expressions of emotion. Humans and dogs showed different gaze distributions when viewing the same facial expressions of either humans or dogs. Humans modulated their gaze depending on the area of interest (AOI) being examined, emotion, and species observed, but dogs modulated their gaze depending on AOI only. We also analysed if the gaze distribution was random across AOIs in both species: in humans, eye movements were not correlated with the diagnostic facial movements occurring in the emotional expression, and in dogs, there was only a partial relationship. This suggests that the scanning of facial expressions is a relatively automatic process. Thus, to read other species' facial emotions successfully, individuals must overcome these automatic perceptual processes and employ learning strategies to appreciate the inter-species emotional repertoire.