2020
DOI: 10.1016/j.specom.2020.03.002

A study of continuous space word and sentence representations applied to ASR error detection

Abstract: This paper presents a study of continuous word representations applied to the automatic detection of speech recognition errors. A neural network architecture is proposed that is well suited to handling continuous word representations such as word embeddings. We explore the use of several types of word representations: simple and combined linguistic embeddings, and acoustic ones associated with prosodic features extracted from the audio signal. To compensate for certain phenomena highlighted by the analysis of the error a…

Cited by 3 publications (3 citation statements)
References 21 publications (32 reference statements)
“…We developed a coding schema based on previous research on common ASR error types and sources (Errattahi et al 2019; Ghannay, Estève, and Camelin 2020). It distinguishes between error types (e.g., error in verbs, nouns) and error sources (e.g., background noise, pronunciation errors) and assessed whether the content/meaning of the response changed due to the transcription.…”
Section: Coding Procedures
Citation type: mentioning
confidence: 99%
“…However, the transcription and the voice-recording might differ. The accuracy of ASR transcriptions might be low due to longer, shorter, missing, added text, or compound words (Errattahi et al 2019; Ghannay, Estève, and Camelin 2020). A common measure of ASR accuracy is the word error rate (WER) which is the number of transcription errors divided by the answer length (Kim et al 2019; Tancoigne et al 2022).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
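
The statement above describes WER as the number of transcription errors divided by the answer length. A minimal sketch of that ratio follows, assuming errors are counted as the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the ASR output, and that "answer length" means the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative choices, not taken from any of the cited papers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing takes i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions are counted as errors but the denominator is the reference length, this ratio can exceed 1 when the hypothesis is much longer than the reference.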
“…However, the GANs have difficulties in dealing with discrete data. In natural languages processing, the text sequences [62] are evaluated as the discrete tokens whose values are non-differentiable. Therefore, the optimization of GANs is challenging.…”
Section: ) Generative Adversarial Net
Citation type: mentioning
confidence: 99%