Our research examines whether humans can assess the valence of human facial expressions presented together with simultaneous human vocalizations of high (pain and pleasure) and low (laugh/smile and neutral expression/speech) intensity. The study was conducted online with a large sample of respondents (n = 902). The task was to rate semi-naturalistic human vocalizations and facial expressions, presented as audio stimuli paired with pictures of faces, as positive, neutral, or negative. The stimuli had been extracted from freely downloadable online videos. Each rating participant (rater) was presented with four facial expressions (stimuli), each accompanied by a simultaneous vocalization: two of high intensity (pain and pleasure) and two of low intensity (laugh/smile and neutral). Using a Bayesian statistical approach, we tested the ratings for consistency and for the probability that they arose by chance. The outcomes support the prediction that the ratings were not due to chance in any case; that is, some ratings were not guesses, even though they might have been incorrect. These findings agree with results from unimodal auditory ratings but not with those from unimodal facial-expression ratings. The highly intense displays, however, were incorrectly attributed. We can therefore assume that the auditory information is dominant in terms of rating certainty, yet it provides no additional information in the case of highly intense affective expressions.
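To make the chance-level test concrete, the following is a minimal illustrative sketch (not the paper's actual model or data): with three response options (positive / neutral / negative), guessing corresponds to a choice probability of 1/3, and a Beta-Binomial posterior can quantify how likely it is that raters' agreement on a category exceeds that chance level. The counts used here are hypothetical, and the Beta posterior is approximated by a normal distribution to keep the example dependency-free.

```python
import math

def prob_above_chance(k, n, chance=1/3):
    """Posterior P(p > chance) for the probability p that a rater picks
    the modal category, under a uniform Beta(1, 1) prior.

    The Beta(1 + k, 1 + n - k) posterior is approximated by a normal
    distribution with the same mean and variance (adequate for large n).
    """
    a, b = 1 + k, 1 + n - k
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    z = (chance - mean) / math.sqrt(var)
    return 0.5 * (1 - math.erf(z / math.sqrt(2)))  # 1 - Phi(z)

# Hypothetical example: 450 of 902 raters chose the same category.
# A posterior probability near 1 would indicate the ratings are very
# unlikely to be pure guesses, even if the chosen category is wrong.
print(prob_above_chance(450, 902))
```

Under these assumed counts the posterior probability is effectively 1, i.e., far above what three-way guessing would produce; a count near n/3 would instead yield a probability near 0.5, consistent with chance.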