2004
DOI: 10.1023/b:ijst.0000037076.86366.8d

Trainable Articulatory Control Models for Visual Speech Synthesis

Abstract: This paper deals with the problem of modelling the dynamics of articulation for a parameterised talking head based on phonetic input. Four different models are implemented and trained to reproduce the articulatory patterns of a real speaker, based on a corpus of optical measurements. Two of the models ("Cohen-Massaro" and "Öhman") are based on coarticulation models from speech production theory and two are based on artificial neural networks, one of which is specially intended for streaming real-time…
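The Cohen-Massaro model named in the abstract blends per-segment articulatory targets using time-varying dominance functions, so that neighbouring segments influence each other (coarticulation). The following is a minimal sketch of that general scheme, not the paper's implementation; the exponential dominance shape and all constants are illustrative assumptions.

```python
import numpy as np

def dominance(t, center, alpha, theta, c=1.0):
    """Exponential dominance of one segment, peaking at its temporal center.
    alpha: peak magnitude, theta: decay rate, c: shape exponent (assumed values)."""
    return alpha * np.exp(-theta * np.abs(t - center) ** c)

def cohen_massaro_trajectory(t, segments):
    """Blend segment targets into a single articulatory parameter trajectory.

    segments: list of dicts with keys 'target' (parameter value),
    'center' (segment midpoint in seconds), 'alpha', 'theta'.
    Returns the dominance-weighted average of the targets at each time in t.
    """
    num = np.zeros_like(t)
    den = np.zeros_like(t)
    for seg in segments:
        d = dominance(t, seg["center"], seg["alpha"], seg["theta"])
        num += d * seg["target"]
        den += d
    return num / np.maximum(den, 1e-9)  # guard against division by zero

# Toy example: three phone segments controlling a lip-rounding parameter.
t = np.linspace(0.0, 0.6, 300)
segments = [
    {"target": 0.1, "center": 0.10, "alpha": 1.0, "theta": 20.0},
    {"target": 0.9, "center": 0.30, "alpha": 1.0, "theta": 20.0},
    {"target": 0.2, "center": 0.50, "alpha": 1.0, "theta": 20.0},
]
trajectory = cohen_massaro_trajectory(t, segments)
```

Because the dominance functions overlap in time, the trajectory approaches each target smoothly rather than stepping between them, which is the essential coarticulatory behaviour such models capture.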


Cited by 32 publications (34 citation statements)
References 27 publications
“…The speech files of the 40 sentences were force-aligned using an HMM aligner [29] to guide the talking head lip movement generation procedure [30]. The audio was processed using a 4-channel noise excited vocoder [31] to reduce intelligibility.…”
Section: Methods and Setup (mentioning)
confidence: 99%
“…The speech files of the 40 sentences were force-aligned using an HMM aligner [29] to guide the talking head lip movement generation procedure [30]. The audio was processed using a 4-channel noise excited vocoder [31] to reduce intelligibility.…”
Section: Methods and Setup (mentioning)
confidence: 99%
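The noise-excited vocoder mentioned in these citing studies is a standard channel-vocoder technique: band-pass the speech into a few frequency bands, extract each band's amplitude envelope, and re-impose the envelopes on band-limited noise. A minimal sketch follows, assuming NumPy/SciPy; the band edges, filter orders, and envelope cutoff are illustrative assumptions, not taken from the cited work.

```python
import numpy as np
from scipy.signal import butter, sosfilt, sosfiltfilt

def noise_vocoder(x, fs, edges=(100, 500, 1300, 2500, 5000)):
    """4-channel noise-excited vocoder sketch.

    Splits x into 4 bands (band edges are assumed, in Hz), extracts each
    band's amplitude envelope, modulates band-limited noise with it, and
    sums the channels. Fine spectral detail is destroyed, which is what
    reduces intelligibility while preserving the temporal envelope.
    """
    rng = np.random.default_rng(0)
    out = np.zeros_like(x, dtype=float)
    env_lp = butter(2, 30, btype="low", fs=fs, output="sos")  # envelope smoother
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        speech_band = sosfiltfilt(band, x)
        envelope = sosfiltfilt(env_lp, np.abs(speech_band))
        noise_band = sosfilt(band, rng.standard_normal(len(x)))
        out += np.clip(envelope, 0.0, None) * noise_band
    return out / (np.max(np.abs(out)) + 1e-9)  # normalize to avoid clipping

# Usage (hypothetical): y = noise_vocoder(signal, fs=16000)
```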
“…As an alternative to the rule-based control model, we have investigated several data-driven (trainable) methods of generating articulatory parameter trajectories to control the face model [12]. The data-driven models are trained on a corpus of articulatory movements recorded from a human speaker, and learn to reproduce the articulatory patterns.…”
Section: Articulatory Control Models (mentioning)
confidence: 99%
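The data-driven approach described in this citing work amounts to learning a frame-wise mapping from phonetic context features to articulatory parameters measured from a human speaker. A minimal sketch of that general setup, assuming scikit-learn and synthetic stand-in data; the feature encoding, window size, and network shape are illustrative assumptions, not the paper's architecture.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical shapes: each frame is described by a window of one-hot
# phone identities, and the target is a vector of articulatory parameters
# taken from the recorded motion corpus.
N_FRAMES, N_PHONES, WINDOW, N_PARAMS = 5000, 40, 5, 10

rng = np.random.default_rng(0)
X = rng.random((N_FRAMES, N_PHONES * WINDOW))  # stand-in phonetic features
Y = rng.random((N_FRAMES, N_PARAMS))           # stand-in measured trajectories

# One hidden layer, trained to reproduce the speaker's articulatory patterns
# frame by frame; at synthesis time the predictions drive the face model.
net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=300)
net.fit(X, Y)
predicted_trajectory = net.predict(X[:100])    # articulatory parameters per frame
```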