Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
DOI: 10.1109/icslp.1996.607807

A compact model for speaker-adaptive training

Abstract: In this work we formulate a novel approach to estimating the parameters of continuous density HMMs for speaker-independent (SI) continuous speech recognition. It is motivated by the fact that variability in SI acoustic models is attributed both to phonetic variation and to variation among the speakers of the training population, the latter being independent of the information content of the speech signal. These two variation sources are decoupled and the proposed method jointly annihilates the inter-speaker variation and …

Cited by 393 publications
(296 citation statements)
References 8 publications
“…In the original formulation (Anastasakos et al, 1996) it involved maximum likelihood linear regression (MLLR) adaptation of the means of output distributions of continuous density HMMs. The resulting HMMs exhibit usually smaller variances and lead to significantly higher likelihood.…”
Section: Sat
mentioning
confidence: 99%
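The statement above says that the original formulation adapts the means of the output distributions with MLLR. As a rough sketch in standard MLLR notation (the symbols $A^{(r)}$, $b^{(r)}$, $\mu_s$, $\lambda_c$, and $O^{(r)}$ are assumptions introduced here, not taken from this page), the speaker-specific mean and the joint SAT objective can be written as:

```latex
% Speaker r has its own affine transform (A^{(r)}, b^{(r)}) applied to each
% speaker-independent Gaussian mean \mu_s of the compact model \lambda_c.
\hat{\mu}_s^{(r)} = A^{(r)} \mu_s + b^{(r)}, \qquad r = 1, \dots, R

% SAT estimates the compact model and all speaker transforms jointly by
% maximizing the likelihood of each speaker's data O^{(r)} under the
% transformed model.
\bigl(\hat{\lambda}_c, \{\hat{A}^{(r)}, \hat{b}^{(r)}\}\bigr)
  = \arg\max_{\lambda_c,\,\{A^{(r)}, b^{(r)}\}}
    \prod_{r=1}^{R} p\bigl(O^{(r)} \mid \lambda_c, A^{(r)}, b^{(r)}\bigr)
```

Because the speaker transforms absorb inter-speaker variation, the variances of the compact model would be expected to shrink, which is consistent with the smaller variances the statement reports.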
“…As a more general solution, to tackle inter-speaker spectral variability in children's speech when training on speech from children of all ages, speaker adaptive acoustic modeling methods can be adopted (Giuliani and Gerosa, 2003; Hagen et al, 2004; Giuliani et al, 2006). In this work, we investigated the use of speaker adaptive acoustic modeling methods, such as vocal tract length normalization (VTLN) (Wegmann et al, 1996; Lee and Rose, 1996; Eide and Gish, 1996), constrained MLLR based speaker normalization (CMLSN) (Giuliani et al, 2006), speaker adaptive training (SAT) (Anastasakos et al, 1996; Gales, 1998) and their combinations. These methods proved to be effective in reducing inter-speaker variability and improved recognition performance on children's speech both in matched conditions, that is training and testing on Italian children aged 7-13, and in unmatched conditions, that is testing on children's speech with models trained on adult speech.…”
Section: Introduction
mentioning
confidence: 99%
“…Each constructed transformation $W_n$ corresponds to speaker characteristics of a pseudo-speaker $n$. We reverse the SAT technique [4] by applying the transformation to the normalized speech features to obtain a variety of speakers. We first apply the normalization matrix for training speaker $i$, $W_i^{-1}$, to the speech features of speaker $i$ and then apply the constructed transformation, $W_n$, to them to generate the speech features of pseudo-speaker $n$:…”
Section: Speech Feature Generation By MLLR Transformations For Pseudo
mentioning
confidence: 99%
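The two-step procedure quoted above (normalize speaker i's features, then apply a constructed transformation for pseudo-speaker n) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the quoted equation is cut off, so the sketch assumes purely linear transforms with no bias term, and the function names and toy numbers are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply_linear(matrix: Matrix, frame: Vector) -> Vector:
    """Multiply one feature frame by a square transformation matrix."""
    return [sum(m * x for m, x in zip(row, frame)) for row in matrix]


def generate_pseudo_speaker_features(
    frames: List[Vector],
    norm_matrix_i: Matrix,    # assumed to play the role of W_i^{-1}
    pseudo_matrix_n: Matrix,  # assumed to play the role of W_n
) -> List[Vector]:
    """Normalize speaker i's frames, then map them to pseudo-speaker n."""
    generated = []
    for frame in frames:
        normalized = apply_linear(norm_matrix_i, frame)
        generated.append(apply_linear(pseudo_matrix_n, normalized))
    return generated


# Toy 2-dimensional example with made-up numbers (not real speech features).
frames_i = [[2.0, 4.0], [6.0, 8.0]]
norm_i = [[0.5, 0.0], [0.0, 0.25]]
w_n = [[2.0, 0.0], [0.0, 3.0]]
pseudo_frames = generate_pseudo_speaker_features(frames_i, norm_i, w_n)
```

Taking the normalization matrix as an input, rather than inverting a per-speaker matrix inside the function, mirrors the quoted wording, which treats it as something already available for each training speaker.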
“…Speaker adaptive training (SAT) [4] has also been pro…”
Section: Introduction
mentioning
confidence: 99%
“…Current state-of-the-art speech recognition systems are trained on large amounts of speech data, typically collected in a range of acoustic conditions. It is also expected that collecting data in conditions related to the actual target application conditions should improve performance. To address this problem adaptive training has been proposed [5], [6], [7], [8]. These schemes make use of adaptation/compensation transformations during training.…”
mentioning
confidence: 99%