2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015
DOI: 10.1109/icassp.2015.7178796

ASR error detection and recognition rate estimation using deep bidirectional recurrent neural networks

Abstract: Recurrent neural networks (RNNs) have recently been applied as the classifiers for sequential labeling problems. In this paper, deep bidirectional RNNs (DBRNNs) are applied for the first time to error detection in automatic speech recognition (ASR), which is a sequential labeling problem. We investigate three types of ASR error detection tasks, i.e. confidence estimation, out-of-vocabulary word detection and error type classification. We also estimate recognition rates from the error type classification result…

Cited by 24 publications (22 citation statements)
References 26 publications
“…Based on the distribution of short edit and capitalization subcategories, we estimate that roughly 80% are amenable to automatic detection algorithms that could be used in an enhanced editing tool to alert physicians to spans of text to check, optionally with proposed corrections. Advances in NLP algorithms for ASR error detection, [30][31][32] disfluency detection, 33 sentence segmentation, 34 true casing, 35 and entity recognition 36 are relevant here. Such algorithms also benefit from incorporating additional resources, such as patient data within the EHR and biomedical knowledge sources, as shown for edit detection.…”
Section: Discussion (mentioning)
confidence: 99%
“…Since some neural architectures showed recently to be effective to process sequence to sequence tasks [46], it could be interesting to compare the neural approach used until now in our experiments to measure the impact of continuous representations to the use of a bidirectional LSTM architecture. Such an architecture is designed to learn how to integrate relevant long distant information, and was successfully used for the ASR error detection task in [6,7]. In our experiments, the bidirectional LSTM architecture is composed of two hidden layers of 512 hidden units each, i.e.…”
Section: Comparison To Bidirectional LSTM System (mentioning)
confidence: 99%
“…In [5], authors propose to use a neural network classifier furnished by stacked auto-encoders (SAE), that helps to learn the error word representations. In [6,7], the authors investigated three types of ASR error detection tasks, e.g. confidence estimation, out-of-vocabulary word detection and error type classification (insertion, substitution or deletion), based on deep bidirectional recurrent neural networks.…”
Section: Introduction (mentioning)
confidence: 99%
“…The latter two methods show slightly superior performance but higher computational complexity compared to the first one. More recently [4], new features and bidirectional recurrent neural networks (RNN) have been proposed for ASR error detection. Most SLU systems reviewed in [5] generate hypotheses of semantic frame slot tags expressed in a spoken sentence analyzed by an ASR system.…”
Section: Related Work (mentioning)
confidence: 99%