Previous studies have found that infants shift their attention from the eyes to the mouth of a talker when they enter the canonical babbling phase after 6 months of age. Here, we investigated whether this increased attentional focus on the mouth is mediated by audiovisual synchrony and linguistic experience. To do so, we tracked eye gaze in 4-, 6-, 8-, 10-, and 12-month-old infants while they were exposed to either desynchronized native or desynchronized non-native audiovisual fluent speech. Results indicated that, regardless of language, desynchronization disrupted the usual pattern of relative attention to the eyes and mouth found in response to synchronized speech at 10 months of age but not at any other age. These findings show that audiovisual synchrony mediates selective attention to a talker's mouth just prior to the emergence of initial language expertise and that synchrony declines in importance once infants become native-language experts.
To investigate the developmental emergence of the ability to perceive the multisensory coherence of native and non-native audiovisual fluent speech, we tested 4-, 8- to 10-, and 12- to 14-month-old English-learning infants. Infants first viewed two identical female faces articulating two different monologues in silence and then in the presence of an audible monologue that matched the visible articulations of one of the faces. Neither the 4-month-olds nor the 8- to 10-month-olds exhibited audiovisual matching; that is, neither group looked longer at the matching monologue. In contrast, the 12- to 14-month-olds exhibited matching and, consistent with the emergence of perceptual expertise for the native language, they perceived the multisensory coherence of native-language monologues earlier in the test trials than that of non-native-language monologues. Moreover, the matching of native audible and visible speech streams observed in the 12- to 14-month-olds did not depend on audiovisual synchrony, whereas the matching of non-native audible and visible speech streams did. Overall, the current findings indicate that the perception of the multisensory coherence of fluent audiovisual speech emerges late in infancy, that audiovisual synchrony cues are more important in the perception of the multisensory coherence of non-native than of native audiovisual speech, and that the emergence of this skill is most likely affected by perceptual narrowing.
Previous studies have found that when monolingual infants are exposed to a talking face speaking in their native language, 8- and 10-month-olds attend more to the talker's mouth, whereas 12-month-olds no longer do so. It has been hypothesized that the attentional focus on the talker's mouth at 8 and 10 months of age reflects reliance on the highly salient audiovisual (AV) speech cues for the acquisition of basic speech forms and that the subsequent decline of attention to the mouth by 12 months of age reflects the emergence of basic native-speech expertise. Here, we investigated whether infants redeploy their attention to the mouth once they fully enter the word-learning phase. To test this possibility, we recorded eye gaze in monolingual English-learning 14- and 18-month-olds while they saw and heard a talker producing an English or Spanish utterance in either an infant-directed (ID) or adult-directed (AD) manner. Results indicated that the 14-month-olds attended more to the talker's mouth than to the eyes when exposed to the ID utterance and that the 18-month-olds attended more to the talker's mouth when exposed to both the ID and the AD utterances. These results show that infants redeploy their attention to a talker's mouth when they enter the word-learning phase and suggest that they rely on the greater perceptual salience of redundant AV speech cues to acquire their lexicon.
We tested 4- to 6- and 10- to 12-month-old infants to investigate whether the often-reported decline in infant sensitivity to other-race faces, known as the other-race effect (ORE), reflects responsiveness to static or dynamic/silent faces rather than a general process of perceptual narrowing. Across three experiments, we tested discrimination of dynamic own-race or other-race faces that were accompanied by a speech syllable, no sound, or a non-speech sound. Results indicated that both 4- to 6- and 10- to 12-month-olds discriminated own-race as well as other-race faces accompanied by a speech syllable, that only the 10- to 12-month-olds discriminated silent own-race faces, and that the 4- to 6-month-olds discriminated own-race and other-race faces accompanied by a non-speech sound whereas the 10- to 12-month-olds discriminated only own-race faces accompanied by a non-speech sound. Overall, the results suggest that the ORE reported to date reflects infant responsiveness to static or dynamic/silent faces rather than a general process of perceptual narrowing.