2011
DOI: 10.1007/978-3-642-18184-9_6

Audio-Visual Prosody: Perception, Detection, and Synthesis of Prominence

Abstract: In this chapter, we investigate the effects of facial prominence cues, in terms of gestures, when synthesized on animated talking heads. In the first study, a speech intelligibility experiment is conducted in which speech quality is acoustically degraded and the speech is then presented to 12 subjects through a lip-synchronized talking head carrying head-nod and eyebrow-raising gestures. The experiment shows that perceiving visual prominence as gestures, synchronized with the auditory prominence, signific…

citations
Cited by 7 publications
(3 citation statements)
references
References 37 publications
“…From a functional point of view, gestures convey meanings that can be referential (representing an entity or event deictically, iconically, or metaphorically) and non‐referential (signaling information structure, modal information, or discourse cohesion) (Kendon, 1980; McNeill, 2000; and many others thereafter). In the context of focus marking, head gestures are a specific type of body movements that serve a non‐referential meaning and can indicate focus quite consistently (Ambrazaitis & House, 2017; Esteve‐Gibert et al., 2017; Ishi et al., 2014), together with other movements such as eyebrow raising (Cavé et al., 1996; Dohen et al., 2006; Moubayed & Beskow, 2011) and manual beats (Roustan & Dohen, 2010).…”
Section: Introduction (mentioning)
confidence: 99%
“…In a perception experiment involving computer synthesized animations, [14] and [15] showed that when headnods and eyebrow raises accompany prominent syllables, they can aid speech perception. On the other hand, it is not clear to them whether they will aid or hinder speech perception when they accompany non-prominent syllables.…”
Section: Theoretical Background (mentioning)
Citing paper venue: 9th International Conference on Speech Prosody 2018, 13-16 June 2018, Poznań, Poland
confidence: 99%
“…Those acoustic differences which are perceptually salient could be primarily exploited by an automatic speech recognition [1,2] and understanding system [3]. Such knowledge appears to be also important for other areas of application: in computer-based pronunciation training, speech annotation for data-driven speech synthesis [4], speech summarization [5], speech comprehension for example through improving the syntactic parsing [6], or in audio-visual speech applications by driving gestures for avatars [7].…”
Section: Introduction (mentioning)
confidence: 99%