2012 IEEE Spoken Language Technology Workshop (SLT) 2012
DOI: 10.1109/slt.2012.6424203

Class-based speech recognition using a maximum dissimilarity criterion and a tolerance classification margin

Abstract: One of the difficult problems of Automatic Speech Recognition (ASR) is dealing with the acoustic signal variability. Much state-of-the-art research has demonstrated that splitting data into classes and using a model specific to each class provides better results. However, when the dataset is not large enough and the number of classes increases, there is less data for adapting the class models and the performance degrades. This work extends and combines previous research on unsupervised splits of datasets to bu…

Cited by 2 publications (2 citation statements)
References 11 publications (15 reference statements)
“…Increasing the number of classes decreases the number of available training utterances associated with each class. This problem can be partially handled by soft clustering techniques, such as eigenvoice approach, where the parameters of an unknown speaker are determined as a combination of class models [10], or by explicitly enlarging the class data by allowing one utterance to belong to several classes [9,7].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…With respect to the training process, increasing the number of classes decreases the number of utterances associated with each class. This problem can be partially handled by soft clustering techniques, such as eigenvoice approach, where the parameters of an unknown speaker are determined as a combination of class models [7], or by explicitly enlarging the class-associated data by allowing one utterance to belong to several classes [8,9].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
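
The citation statements above describe enlarging class data by allowing one utterance to belong to several classes. A minimal sketch of that idea follows, assuming each utterance already has a per-class score (for example a log-likelihood) and that an utterance joins every class whose score lies within a tolerance margin of its best score. The function names, the scores, and the margin values are all hypothetical illustrations and are not taken from the paper or from the citing publications.

```python
from typing import Dict, List


def assign_with_margin(scores: Dict[str, float], margin: float) -> List[str]:
    """Return every class whose score is within `margin` of the best score.

    Assumption: higher scores are better (e.g. log-likelihoods).
    With margin == 0 this reduces to a hard, single-class assignment
    (ties aside).
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")
    best = max(scores.values())
    return sorted(c for c, s in scores.items() if best - s <= margin)


def build_class_data(
    utterances: Dict[str, Dict[str, float]], margin: float
) -> Dict[str, List[str]]:
    """Group utterance ids by class, letting one utterance join several classes."""
    class_data: Dict[str, List[str]] = {}
    for utt_id, scores in utterances.items():
        for cls in assign_with_margin(scores, margin):
            class_data.setdefault(cls, []).append(utt_id)
    return class_data


# Hypothetical per-class scores for three utterances (made-up numbers).
utterances = {
    "utt1": {"A": -10.0, "B": -10.5, "C": -14.0},
    "utt2": {"A": -12.0, "B": -9.0, "C": -9.2},
    "utt3": {"A": -8.0, "B": -15.0, "C": -13.0},
}

hard = build_class_data(utterances, margin=0.0)  # one class per utterance
soft = build_class_data(utterances, margin=1.0)  # near-ties join several classes

print("hard:", hard)
print("soft:", soft)
```

With a zero margin, class C receives no data in this toy example. With a margin of 1.0 it gains one utterance and class B gains a second, which is the data-enlargement effect the statements refer to. A larger margin gives each class more data but makes the classes less distinct.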