2015
DOI: 10.1250/ast.36.467
The use of articulatory movement data in speech synthesis applications: An overview — Application of articulatory movements using machine learning algorithms —

Abstract: This paper describes speech processing work in which articulator movements are used in conjunction with the acoustic speech signal and/or linguistic information. By "articulator movements," we mean the changing positions of human speech articulators such as the tongue and lips, which may be recorded by electromagnetic articulography (EMA), amongst other articulography techniques. Specifically, we provide an overview of: i) inversion mapping techniques, where we estimate articulator movements from a given new…

Cited by 17 publications (18 citation statements)
References 61 publications
“…Access to the articulatory domain through imaging techniques such as ultrasound gives additional information over the acoustic domain. Indeed, in addition to the area of speech and language pathology, prior work has shown that articulatory information has the potential to improve performance in multiple aspects of speech technology, for instance: speech recognition [19], speech synthesis [20], and silent speech interfaces [21]. Table 1: The number of participants, their gender and ages.…”
Section: Broader Applicability of the Data
confidence: 99%
“…where Θ denotes the parameter set of the GMM, which can be estimated from training data using the EM algorithm under the maximum-likelihood criterion; N(·; μ, Σ) denotes a normal distribution with mean vector μ and covariance matrix Σ; M is the number of mixture components; and α_m is the weight of the m-th component. At the mapping stage, the distribution of acoustic features given the input articulatory feature sequence can be derived from (1). Then, the converted acoustic features can be estimated using the minimum mean-square error (MMSE) criterion or by maximum likelihood estimation (MLE) [17].…”
Section: Previous Work
confidence: 99%
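The statement above describes the standard joint-GMM mapping from articulatory to acoustic features: a GMM is fitted on concatenated [articulatory, acoustic] frames, and the MMSE estimate weights each component's conditional mean by its responsibility given the articulatory input. A minimal sketch of that MMSE step, assuming scikit-learn's `GaussianMixture` and hypothetical helper names (`fit_joint_gmm`, `mmse_map`), not any specific implementation from the cited work:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(artic, acoustic, n_components=4, seed=0):
    """Fit a joint GMM on concatenated [articulatory, acoustic] frames."""
    joint = np.hstack([artic, acoustic])
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full", random_state=seed)
    gmm.fit(joint)
    return gmm

def mmse_map(gmm, x, dx):
    """MMSE estimate of acoustic features y given one articulatory frame x.

    dx is the dimensionality of the articulatory part of the joint vector.
    """
    # Responsibilities p(m | x) under the marginal GMM over x.
    means_x = gmm.means_[:, :dx]
    covs_xx = gmm.covariances_[:, :dx, :dx]
    log_resp = np.empty(gmm.n_components)
    for m in range(gmm.n_components):
        diff = x - means_x[m]
        prec = np.linalg.inv(covs_xx[m])
        _, logdet = np.linalg.slogdet(covs_xx[m])
        ll = -0.5 * (diff @ prec @ diff + logdet + dx * np.log(2 * np.pi))
        log_resp[m] = np.log(gmm.weights_[m]) + ll
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()
    # Responsibility-weighted sum of conditional means E[y | x, m].
    y_hat = np.zeros(gmm.means_.shape[1] - dx)
    for m in range(gmm.n_components):
        mu_y = gmm.means_[m, dx:]
        cov_yx = gmm.covariances_[m, dx:, :dx]
        cov_xx = gmm.covariances_[m, :dx, :dx]
        y_hat += resp[m] * (mu_y + cov_yx @ np.linalg.solve(cov_xx, x - means_x[m]))
    return y_hat
```

The MLE alternative mentioned in the excerpt additionally optimizes over whole trajectories (typically with dynamic features), which this per-frame sketch omits.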
“…Therefore, articulatory features and acoustic features are inherently related. Similar to acoustic-to-articulatory inversion mapping [1], the conversion from articulatory features to acoustic features is also useful in many applications. In speech synthesis, the characteristics of the synthetic speech can be conveniently controlled by manipulating articulatory features [2].…”
Section: Introduction
confidence: 99%
“…Measurements of articulatory gestures via EMA have found application in several speech processing problems, such as robust speech recognition [7,8], speech synthesis [9,10], and speech modification [3,11]. When the vocal tract outline is available, EMA pellet positions can be converted into constriction-based features known as tract variables [12], which are more informative of phonological category [13,14].…”
Section: Related Work
confidence: 99%
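The pellet-to-tract-variable conversion mentioned above amounts to simple geometry on the midsagittal coordinates. A rough sketch, assuming two common tract variables (lip aperture as the distance between lip pellets, and a tongue-tip constriction degree as the distance to a sampled vocal-tract outline); the function names are illustrative, not from the cited work:

```python
import numpy as np

def lip_aperture(ul, ll):
    """Lip aperture (LA): per-frame Euclidean distance between the
    upper-lip and lower-lip pellets. ul, ll: (T, 2) midsagittal (x, y)."""
    return np.linalg.norm(ul - ll, axis=1)

def tongue_tip_constriction_degree(tt, outline):
    """TTCD: per-frame minimum distance from the tongue-tip pellet to a
    sampled vocal-tract outline. tt: (T, 2); outline: (P, 2) points."""
    # Pairwise distances (T, P), then the nearest outline point per frame.
    d = np.linalg.norm(tt[:, None, :] - outline[None, :, :], axis=2)
    return d.min(axis=1)
```

Real tract-variable extraction (e.g., following [12]) involves more careful outline registration and smoothing, but the constriction-degree idea reduces to this nearest-distance computation.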