Interspeech 2019
DOI: 10.21437/interspeech.2019-1241

Improving ASR Confidence Scores for Alexa Using Acoustic and Hypothesis Embeddings

Abstract: In automatic speech recognition, confidence measures provide a quantitative representation used to assess whether a generated hypothesis text is correct or not. For personal assistant devices like Alexa, automatic speech recognition (ASR) errors are inevitable due to the imperfection of today's speech recognition technology. Hence, confidence scores provide an important metric to gauge the correctness of ASR hypothesis text and enable downstream consumers to subsequently initiate appropriate actions. In this w…

Citations: cited by 24 publications (15 citation statements)
References: 15 publications
“…They can be divided into three categories a) utterance verification methods, b) posterior probabilities based methods, and c) methods combining predictor features. Among these techniques, predictor feature-based methods have become popular [8,9]. The objective is to train a classifier using various combination of features from speech recognition model [10,11,12].…”
Section: Relation To Prior Work (mentioning)
confidence: 99%
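The predictor-feature approach described in the statement above can be illustrated with a minimal sketch. It assumes a plain logistic-regression classifier trained with stochastic gradient descent on two made-up, normalized features per hypothesis; the feature choice, the toy data, and the function names are all hypothetical and are not taken from the cited papers.

```python
import math


def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_confidence_classifier(features, labels, lr=0.5, epochs=500):
    """Fit logistic-regression weights with per-example gradient steps.

    features: list of feature vectors, one per ASR hypothesis (assumed inputs).
    labels:   1 if the hypothesis was correct, 0 otherwise.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def confidence(weights, bias, x):
    """Return the predicted probability that a hypothesis is correct."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Toy data (hypothetical): [normalized acoustic score, normalized LM score].
train_x = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.3], [0.1, 0.2]]
train_y = [1, 1, 0, 0]
weights, bias = train_confidence_classifier(train_x, train_y)
```

Calling `confidence(weights, bias, [0.85, 0.85])` then yields a score above 0.5, and a low-scoring feature vector such as `[0.15, 0.25]` yields a score below 0.5.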
“…We gather these features as a representative of acoustic summary of input. [8] shows that acoustic embedding plays an important role for measuring confidence score. Although joint network uses these features for predictions but [14] show that explicitly passing these features to NCM Model gives significant boost to the model performance.…”
Section: Transcription Network Output (Trans) (mentioning)
confidence: 99%
“…It is critical to not only achieve a high wake‐up rate, but also suppress false alarms in the KWS system, necessitating the confidence score estimation. Traditional approaches primarily calculate the confidence from n‐best hypothesis or lattice, relying heavily on the accuracy of the posterior probability [9]. For better performance, neural confidence estimation methods are drawing wide research interests to date.…”
Section: Introduction: Open‐vocabulary (mentioning)
confidence: 99%
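As a rough illustration of the traditional n-best approach mentioned in the statement above, the sketch below normalizes the log scores of an n-best list into posterior probabilities and uses the top posterior as the confidence. The hypothesis texts, the scores, and the function names are hypothetical; real systems may instead work on lattices or apply score scaling, which is not shown here.

```python
import math


def nbest_posteriors(log_scores):
    """Softmax over n-best log scores, shifted by the max for numerical stability."""
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_hypothesis_confidence(nbest):
    """Return (text, posterior) for the highest-scoring hypothesis.

    nbest: list of (hypothesis_text, log_score) pairs (assumed input format).
    """
    posteriors = nbest_posteriors([score for _, score in nbest])
    best = max(range(len(nbest)), key=lambda i: posteriors[i])
    return nbest[best][0], posteriors[best]


# Hypothetical n-best list with combined log scores.
nbest = [("play music", -1.0), ("play muse", -3.0), ("pay music", -4.0)]
text, conf = top_hypothesis_confidence(nbest)
```

Because the posterior is only as reliable as the scores it is computed from, a confidence obtained this way inherits any miscalibration in the recognizer, which is the weakness the quoted statement points to.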
“…The proposed rescoring method is closely related to confidence estimation, or the ASR error detection task. Confidence estimation assesses the quality of ASR predictions [18,19,20,21,22,23,24,25], which is useful for many downstream ASR applications such as voice assistants. We demonstrate that our models for rescoring can be applied to confidence estimation without any additional architectural changes or training.…”
Section: Introduction (mentioning)
confidence: 99%