Interspeech 2010
DOI: 10.21437/interspeech.2010-204

Setup for acoustic-visual speech synthesis by concatenating bimodal units

Abstract: This paper presents preliminary work on building a system able to synthesize concurrently the speech signal and a 3D animation of the speaker's face. This is done by concatenating bimodal diphone units, that is, units that comprise both acoustic and visual information. The latter is acquired using a stereovision technique. The proposed method addresses the problems of asynchrony and incoherence inherent in classic approaches to audiovisual synthesis. Unit selection is based on classic target and join costs fro…

Cited by 2 publications (1 citation statement)
References 6 publications
“…We have been working on talking head synthesis by concatenating units composed of bimodal acoustic-visual information [1]. One of the future goals of our project is to enhance our facial animation with an animation of the tongue.…”
Section: Introduction (mentioning)
confidence: 99%