Machine Learning for Multimodal Interaction
DOI: 10.1007/978-3-540-78155-4_18

Posterior-Based Features and Distances in Template Matching for Speech Recognition

Abstract: The use of large speech corpora in example-based approaches for speech recognition is mainly focused on increasing the number of examples. This strategy presents some difficulties because databases may not provide enough examples for some rare words. In this paper we present a different method to incorporate the information contained in such corpora in these example-based systems. A multilayer perceptron is trained on these databases to estimate speaker- and task-independent phoneme posterior probabili…

Cited by 22 publications (20 citation statements)
References 17 publications

“…This approach is based on TM where speech features are phoneme posterior estimates. In this paper, we confirm the suitability of applying KL divergence when using posterior features as observed in previous experiments [5]. We also show that a weighted combination of the KL divergence can further improve the accuracy.…”
Section: Discussion (supporting)
confidence: 87%
“…previous experiments [5] have shown that the use of the KL divergence can yield better performance when using posterior features. In the next section, we describe the local distances used in this work.…”
Section: Template Matching Approach (mentioning)
confidence: 93%
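
The statements above describe template matching in which each frame is a vector of phoneme posteriors and the local distance between frames is a KL divergence. A minimal sketch of that idea follows. It is not the cited authors' implementation: the function names `kl_divergence` and `dtw_distance`, the direction of the divergence (template frame as the first argument), the smoothing constant `eps`, and the unconstrained three-way DTW step pattern are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) between two discrete distributions given as sequences.

    Values are floored at eps (an assumed smoothing constant) and
    renormalised so that the logarithm is always defined.
    """
    p = [max(x, eps) for x in p]
    q = [max(x, eps) for x in q]
    sp, sq = sum(p), sum(q)
    return sum((a / sp) * math.log((a / sp) / (b / sq)) for a, b in zip(p, q))


def dtw_distance(template, test, local_dist=kl_divergence):
    """Accumulated DTW cost between two sequences of posterior vectors.

    Uses the basic step pattern (insertion, deletion, match) with no
    slope constraints; real systems often add constraints and length
    normalisation, which are omitted here.
    """
    n, m = len(template), len(test)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_dist(template[i - 1], test[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


# Toy two-phoneme posteriors: the test utterance stretches the first frame.
template = [[0.9, 0.1], [0.1, 0.9]]
stretched = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]
reversed_order = [[0.1, 0.9], [0.9, 0.1]]

print(dtw_distance(template, stretched))
print(dtw_distance(template, reversed_order))
```

In this toy case the stretched utterance aligns to the template at a lower accumulated cost than the reversed one, which is the behaviour a template matcher relies on when picking the closest reference. Because KL divergence is asymmetric, swapping the argument order changes the distances, so the choice of direction is a design decision rather than a detail.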
“…We also show that hybrid HMM/ANN system can be interpreted as a particular case of this model where state target distributions are fixed and equal to a delta distribution. Furthermore, this system naturally extends our previous work where we successfully applied posterior features and KL-divergence to the template matching approach for ASR [8].…”
Section: Introduction (supporting)
confidence: 62%
“…Acoustic features, such as the mel-scale frequency cepstral coefficients (MFCCs), are highly sensitive to speaker and channel variations. On the other hand, posterior-based features, such as the phonetic posteriorgram, are more robust and also widely used in speech recognition [19][20][21]. Therefore, we used the posterior feature to measure the acoustic distance between OOV candidates.…”
Section: Acoustic Distance (mentioning)
confidence: 99%