2004
DOI: 10.1016/j.cognition.2004.01.006

Seeing to hear better: evidence for early audio-visual interactions in speech identification

Abstract: Lip reading is the ability to partially understand speech by looking at the speaker's lips. It improves the intelligibility of speech in noise when audio-visual perception is compared with audio-only perception. A recent set of experiments showed that seeing the speaker's lips also enhances sensitivity to acoustic information, decreasing the auditory detection threshold of speech embedded in noise [J. Acoust. Soc. Am. 109 (2001) 2272; J. Acoust. Soc. Am. 108 (2000) 1197]. However, detection is different from c…

Cited by 281 publications (121 citation statements)
References 18 publications

Citation statements, ordered by relevance:
“…Congruent audiovisual words are known to be recognized more easily and the visual information provided by the face and mouth before the onset of the auditory signal contributes to the recognition of the uttered word (e.g. Schwartz et al., 2004; van Wassenhove et al., 2005). However, such processing advantages should facilitate the perception of emotional and neutral words in a similar way.…”
Section: Discussion
confidence: 96%
“…Paulmann and Pell, 2011; Paulmann et al., 2009; see Klasen et al., 2012 for a review) and non-emotional speech signals (e.g. Schwartz et al., 2004; van Wassenhove et al., 2005) and is mandatorily processed (e.g. de Gelder and Vroomen, 2000) already during early perceptual processing stages (e.g.…”
Section: Introduction
confidence: 98%
“…Although CI users are able to integrate their visuoauditory signal efficiently and compensate for the loss of spectral information, none of the naïve NH subjects listening to CI stimulations reach the same level of VA supraadditive integration. Altogether, we suggest that CI users have developed specific visuoauditory skills that lead to a powerful utilization of the visual spatiotemporal cues (29) provided by the lip and face movements (10), allowing these patients to reach near-perfect performance in visuoauditory situations. Using our computational model that allows us to avoid ceiling effects in subjects' performance, we confirmed that the performance of CI patients derived not only from higher efficiency in speechreading, but also from the acquisition of a higher skill level in multisensory integration when visual speech information is matched to an impoverished auditory signal.…”
Section: Discussion
confidence: 99%
“…The next question addressed was whether the visual onset cue could be provided by any visible information, even non-speech, or if it was specific to seeing the articulatory gestures through lip movements. Some hints that the effect might be speech-specific are available from previous studies showing that the audibility of speech sounds embedded in noise is improved by seeing coherent lip movements, but that the enhancement is decreased or eliminated if lip movements are replaced by bars going up and down in synchrony with the original lip movements [37,38]. Therefore, in an original experiment reported next, we tested whether the visual onset effect observed by Sato et al. [36] would occur when the lip movements of /paaa/ and /taaa/ were replaced by a vertical bar varying in height.…”
Section: Multimodal Nature Of Verbal Transformations
confidence: 99%