2020
DOI: 10.1073/pnas.2002887117

Vision perceptually restores auditory spectral dynamics in speech

Abstract: Visual speech facilitates auditory speech perception, but the visual cues responsible for these benefits and the information they provide remain unclear. Low-level models emphasize basic temporal cues provided by mouth movements, but these impoverished signals may not fully account for the richness of auditory information provided by visual speech. High-level models posit interactions among abstract categorical (i.e., phonemes/visemes) or amodal (e.g., articulatory) speech representations, but require …

Cited by 28 publications (36 citation statements); References 68 publications

“…While acoustic speech consists of temporal and spectral modulations of sound pressure, visual speech consists of movements of the mouth, head, and hands. Movements of the mouth, lips and tongue in particular provide both redundant and complementary information to acoustic cues ( Hall et al, 2005 ; Peelle and Sommers, 2015 ; Plass et al, 2019 ; Summerfield, 1992 ), and can help to enhance speech intelligibility in noisy environments and in a second language ( Navarra and Soto-Faraco, 2007 ; Sumby and Pollack, 1954 ; Yi et al, 2013 ). While a plethora of studies have investigated the cerebral mechanisms underlying speech in general, we still have a limited understanding of the networks specifically mediating visual speech perception, that is lip reading ( Bernstein and Liebenthal, 2014 ; Capek et al, 2008 ; Crosse et al, 2015 ).…”
Section: Introduction
confidence: 99%
“…Here we can show that the brain is also able to perform more fine-grained tracking than initially thought, especially by processing the spectral fine details that are modulated near the lips, another potentially learned association between lip-near auditory cues (i.e., merged F2 and F3 formants) and lip movements (Plass et al., 2020). Additionally, it is not only formants that are subject to visuo-phonological transformation, but also the fundamental frequency, as seen in our results.…”
Section: Discussion
confidence: 75%
“…Those two formants fluctuate around 2500 Hz and tend to merge into a single peak when pronouncing certain consonant-vowel combinations (Badin et al., 1990). This merging process takes place in the front region of the oral cavity and can therefore also be observed in lip movements (Plass et al., 2020). For simplicity, the formants were extracted at a rate of 200 Hz and then downsampled to 150 Hz.…”
Section: Extraction of Lip Area, Acoustic Speech Envelope, Formants, and Pitch
confidence: 99%
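
The extraction step quoted above (F2/F3 formant tracks sampled at 200 Hz, then downsampled to 150 Hz) can be illustrated with a minimal Python sketch. This is not the cited study's actual pipeline: it assumes the audio is available as a WAV file, uses the parselmouth (Praat) and SciPy libraries, and all parameter values other than the quoted 200 Hz and 150 Hz rates are illustrative.

    # Minimal sketch: extract F2/F3 formant tracks at 200 Hz and downsample to 150 Hz.
    # Assumptions: parselmouth (Praat bindings) and SciPy are available; the quoted
    # study's exact tooling and settings (beyond the 200 Hz -> 150 Hz rates) are unknown.
    import numpy as np
    import parselmouth
    from scipy.signal import resample_poly

    def extract_formant_tracks(wav_path, src_rate=200, target_rate=150):
        snd = parselmouth.Sound(wav_path)
        # time_step = 1 / 200 Hz gives one formant estimate every 5 ms
        formant = snd.to_formant_burg(time_step=1.0 / src_rate,
                                      max_number_of_formants=5,
                                      maximum_formant=5500.0)
        times = np.arange(0.0, snd.duration, 1.0 / src_rate)
        tracks = {}
        for n in (2, 3):  # F2 and F3, the formants that merge near ~2500 Hz
            values = np.array([formant.get_value_at_time(n, t) for t in times])
            # Praat returns NaN in unvoiced stretches; interpolate so resampling works
            valid = ~np.isnan(values)
            values = np.interp(times, times[valid], values[valid])
            # 200 Hz -> 150 Hz is the rational ratio 3/4
            tracks[f"F{n}"] = resample_poly(values, up=3, down=4)
        return tracks

    if __name__ == "__main__":
        # "speech.wav" is a hypothetical file name used only for illustration
        tracks = extract_formant_tracks("speech.wav")
        print({name: track.shape for name, track in tracks.items()})

The polyphase resampling works here because 200 Hz and 150 Hz share the exact ratio 3:4; an arbitrary target rate would instead call for interpolation onto the desired time grid.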