2007
DOI: 10.1109/tasl.2007.894529

Trajectory Clustering for Solving the Trajectory Folding Problem in Automatic Speech Recognition

Abstract: In this paper we introduce a novel method for clustering speech gestures, represented as continuous trajectories in acoustic parameter space. Trajectory Clustering allows us to avoid the conditional independence assumption that makes it difficult to account for the fact that successive measurements of an articulatory gesture are correlated. We apply the Trajectory Clustering method for developing multiple parallel HMMs for a continuous digits recognition task. We compare the performance obtained with data-dri…

citations
Cited by 12 publications
(14 citation statements)
references
References 16 publications
“…Our spectral reduction measure is susceptible to the well-known trajectory folding problem (Han et al, 2007); different tokens taking different trajectories through the acoustic space may end up with identical log-likelihoods, even if their trajectories make very different auditory impressions. This is yet another reason why it may not be appropriate to map multidimensional acoustic reduction to a real number.…”
Section: Discussion
confidence: 99%
“…The method of Han et al [7] does trajectory clustering by using the mixture model for automatic speech recognition. They report on the general problem of different initial cluster assignments leading to different EM clusters.…”
Section: Related Work
confidence: 99%
“…For each subsequent frame these parameters are predicted with the help of Kalman filter. Unlike the method of Han et al [7], which utilises splitting during initialisation, Xiong et al integrate "split", "merge" and "delete" operation into their dynamic Gaussian mixture model.…”
Section: Related Work
confidence: 99%
“…Therefore, we must conclude that multi-path syllable models, however they may be initialised and trained, may not be the way towards solving the pronunciation variation problem in ASR. Using the acoustic variation in speech as the basis for constructing parametric models of speech (Deng et al, 2006; Han et al, 2007; Zen et al, 2007) will not solve the context modelling problem either. It may well aggravate the problem because it is difficult to link bottom-up acoustic variation to the lexicon and the language model.…”
Section: Directions For Future Research
confidence: 99%