2022
DOI: 10.1371/journal.pcbi.1010273

Modulation transfer functions for audiovisual speech

Abstract: Temporal synchrony between facial motion and acoustic modulations is a hallmark feature of audiovisual speech. The moving face and mouth during natural speech is known to be correlated with low-frequency acoustic envelope fluctuations (below 10 Hz), but the precise rates at which envelope information is synchronized with motion in different parts of the face are less clear. Here, we used regularized canonical correlation analysis (rCCA) to learn speech envelope filters whose outputs correlate with motion in di…
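The abstract refers to learning speech-envelope filters with regularized canonical correlation analysis (rCCA) so that the filtered envelope correlates with facial motion. As a rough illustration only (not the authors' pipeline or code), the sketch below implements a ridge-regularized CCA in plain NumPy on synthetic data; the function name rcca, the reg shrinkage parameter, and the toy "envelope" and "motion" features are all illustrative assumptions.

```python
import numpy as np

def rcca(X, Y, reg=0.1, n_components=1):
    """Ridge-regularized CCA via the generalized eigenvalue formulation.

    X: (n_samples, n_x) features, e.g. band-filtered speech-envelope channels
    Y: (n_samples, n_y) features, e.g. facial-motion trajectories
    reg: shrinkage added to the auto-covariance matrices (illustrative choice)
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # Solve Cxx^{-1} Cxy Cyy^{-1} Cyx w = rho^2 w for the x-side weights.
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
    evals, evecs = np.linalg.eig(M)
    order = np.argsort(-evals.real)[:n_components]
    Wx = evecs[:, order].real
    # The y-side weights follow from Wy ∝ Cyy^{-1} Cyx Wx.
    Wy = np.linalg.solve(Cyy, Cxy.T) @ Wx
    Wy /= np.linalg.norm(Wy, axis=0, keepdims=True)
    return Wx, Wy

# Toy example: two feature sets sharing one latent component.
rng = np.random.default_rng(0)
shared = rng.standard_normal((1000, 1))
X = np.hstack([shared + 0.5 * rng.standard_normal((1000, 1)) for _ in range(4)])
Y = np.hstack([shared + 0.5 * rng.standard_normal((1000, 1)) for _ in range(3)])
Wx, Wy = rcca(X, Y, reg=0.1)
rho = np.corrcoef((X @ Wx).ravel(), (Y @ Wy).ravel())[0, 1]
print(f"first canonical correlation ≈ {rho:.2f}")
```

The ridge term added to the auto-covariance matrices is what keeps the solution stable when the feature sets are high-dimensional or collinear, which is the usual motivation for the regularized variant of CCA.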

Cited by 4 publications (3 citation statements) · References: 104 publications

“…Under such more or less improved measurement conditions a whole host of avenues for further research opens up. This would not only allow for a more fine-grained analysis of gesture–speech coupling (Alviar et al, 2020; Krivokapic et al, 2017; Pedersen et al, 2022; Wagner et al, 2014). It could also expand on research by relating to studies showing that gesture kinematics change depending on the communicative context (Trujillo et al, 2018) and psychological background (Trujillo, Özyürek, et al, 2021), and that such modulation of kinematics can affect smooth conversational turn-taking (Trujillo, Levinson, & Holler, 2021).…”
Section: Discussion
confidence: 99%
“…The system could potentially fail to generalize to this mismatching stimulus condition. Alternatively, the visual face may be correlated with audio envelope information (O’Sullivan et al, 2017a; Pedersen et al, 2022), and it may be easier to focus auditory attention on a speaker that can be seen, which may in turn improve decoding (O’Sullivan et al, 2013). Potential visual benefits are important to investigate, for instance in the case where the user wants to switch attention to a previously ignored speaker.…”
Section: Demonstrations
confidence: 99%
“…From this point of view, the observed motion of the articulators (visual speech) provides a direct source of information about speech rhythm. Indeed, clearly observable peri-oral movements are significantly correlated with fluctuations in the speech envelope at rhythm rates that peak, roughly, at the syllable rate of 3 to 4 Hz [ 19 ]. Thus, for current purposes we propose that speech-rhythm information is conveyed by the motion of jaw opening and closing.…”
Section: Introduction
confidence: 99%