2002
DOI: 10.1109/tsa.2002.804304

High-level approaches to confidence estimation in speech recognition

Abstract: We describe some high-level approaches to estimating confidence scores for the words output by a speech recognizer. By "high-level" we mean that the proposed measures do not rely on decoder-specific "side information" and so should find more general applicability than measures that have been developed for specific recognizers. Our main approach is to attempt to decouple the language modeling and acoustic modeling in the recognizer in order to generate independent information from these two sources tha…

Citations: cited by 43 publications (30 citation statements)
References: 17 publications
“…Instead, a hidden Markov model (HMM) is constructed for each of the phonemes in the phoneme inventory. We term these HMMs metamodels [40]. The function of a metamodel is best understood by comparison with a "standard" acoustic HMM: a standard acoustic HMM estimates Pr(O | p_j), where O is a subsequence of the complete sequence of observed acoustic vectors in the utterance, O, and p_j is a postulated phoneme in P. A metamodel estimates Pr(P | p_j), where P is a subsequence of the complete sequence of observed (decoded) phonemes in the utterance P.…”
Section: First Technique: Metamodels
Citation type: mentioning
confidence: 99%
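
The quantity Pr(P | p_j) in the statement above is the likelihood of a decoded phoneme subsequence under an HMM whose outputs are phoneme labels rather than acoustic vectors. A minimal sketch of that computation follows, using the standard forward algorithm for a discrete-output HMM. The function name, the two-state topology, and every probability in the toy model are illustrative assumptions; the citing paper's actual metamodel structure and parameters are not given in the excerpt.

```python
from typing import Dict, List, Sequence


def forward_likelihood(
    observations: Sequence[str],
    initial: List[float],
    transition: List[List[float]],
    emission: List[Dict[str, float]],
) -> float:
    """Return Pr(observations | model) for a discrete-output HMM.

    observations: decoded phoneme labels (the subsequence P in the text).
    initial[s]: probability of starting in state s.
    transition[r][s]: probability of moving from state r to state s.
    emission[s][label]: probability that state s emits the given label.
    """
    if not observations:
        raise ValueError("observations must contain at least one label")

    n_states = len(initial)
    # alpha[s] = probability of the labels seen so far, ending in state s.
    alpha = [
        initial[s] * emission[s].get(observations[0], 0.0)
        for s in range(n_states)
    ]
    for label in observations[1:]:
        alpha = [
            sum(alpha[r] * transition[r][s] for r in range(n_states))
            * emission[s].get(label, 0.0)
            for s in range(n_states)
        ]
    # Sum over all possible final states.
    return sum(alpha)


# Hypothetical metamodel for the postulated phoneme "t" (all values invented).
TOY_INITIAL = [1.0, 0.0]
TOY_TRANSITION = [[0.5, 0.5], [0.0, 1.0]]
TOY_EMISSION = [
    {"t": 0.8, "d": 0.2},
    {"t": 0.6, "d": 0.3, "k": 0.1},
]

likelihood = forward_likelihood(
    ["t", "d"], TOY_INITIAL, TOY_TRANSITION, TOY_EMISSION
)
```

In this toy model a decoded sequence that the postulated phoneme rarely produces (for example one containing "k" in the first position) receives a likelihood of zero, which is the kind of evidence a confidence measure could use.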
“…Because the raw acoustic scores are usually not particularly useful as confidence measures when used by themselves [1], methods for normalizing these scores are typically employed [3,8,13]. In this work all of the acoustic scores produced at the phonetic level are normalized against a catch-all model.…”
Section: Phonetic Level Scoring
Citation type: mentioning
confidence: 99%
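
One common way to normalize a raw acoustic score against a catch-all model is a log-likelihood ratio. The sketch below assumes that form and additionally averages over the number of frames in the segment; both choices are assumptions for illustration, since the excerpt says only that scores are "normalized against a catch-all model". The function name and arguments are hypothetical.

```python
def normalized_phone_score(
    phone_loglik: float,
    catchall_loglik: float,
    n_frames: int,
) -> float:
    """Per-frame log-likelihood ratio of a phone model against a catch-all model.

    phone_loglik: log-likelihood of the segment under the hypothesized phone model.
    catchall_loglik: log-likelihood of the same segment under the catch-all model.
    n_frames: number of acoustic frames in the segment (assumed normalizer).
    """
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return (phone_loglik - catchall_loglik) / n_frames


# Invented values: the phone model fits the segment better than the catch-all.
score = normalized_phone_score(-120.0, -150.0, 10)
```

A positive score means the hypothesized phone explains the segment better than the catch-all model does; a score near or below zero suggests the hypothesis is weakly supported.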
“…Confidence measures are used on various applications in speech recognition field. Experiments in large vocabulary continuous speech recognition, reported in [1] and [2], show that the use of confidence measure for constructing a word graph significantly increases the recognition performance. In [4] application of confidence measure in language identification task is explained.…”
Section: Introduction
Citation type: mentioning
confidence: 99%