Functional near infrared spectroscopy (fNIRS) has been gaining increasing interest as a practical mobile functional brain imaging technology for understanding the neural correlates of social cognition and emotional processing in the human prefrontal cortex (PFC). Considering the cognitive complexity of human-robot interactions, the aim of this study was to explore the neural correlates of emotional processing of congruent and incongruent pairs of human and robot audio-visual stimuli in the human PFC with fNIRS methodology. Hemodynamic responses from the PFC region of 29 subjects were recorded with fNIRS during an experimental paradigm which consisted of auditory and visual presentation of human and robot stimuli. Distinct neural responses to human and robot stimuli were detected at the dorsolateral prefrontal cortex (DLPFC) and orbitofrontal cortex (OFC) regions. Presentation of robot voice elicited significantly less hemodynamic response than presentation of human voice in a left OFC channel. Meanwhile, processing of human faces elicited significantly higher hemodynamic activity when compared to processing of robot faces in two left DLPFC channels and a left OFC channel. Significant correlation between the hemodynamic and behavioral responses for the face-voice mismatch effect was found in the left OFC. Our results highlight the potential of fNIRS for unraveling the neural processing of human and robot audio-visual stimuli, which might enable optimization of social robot designs and contribute to elucidation of the neural processing of human and robot stimuli in the PFC in naturalistic conditions.