Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen 2019
DOI: 10.18653/v1/d19-3011

Entity resolution for noisy ASR transcripts

Abstract: Large vocabulary domain-agnostic Automatic Speech Recognition (ASR) systems often mistranscribe domain-specific words and phrases. Since these generic ASR systems are the first component of most voice assistants in production, building Natural Language Understanding (NLU) systems that are robust to these errors can be a challenging task. In this paper, we focus on handling ASR errors in named entities, specifically person names, for a voice-based collaboration assistant. We demonstrate an effective method for r…

Citations: cited by 10 publications (8 citation statements)
References: 13 publications
“…Our work is also closely related to two entity-level tasks: entity correction and entity linking. Entity correction [7,8,9,10] aims to tackle errors occurring in Automatic Speech Recognition (ASR) systems. These studies adopt an "entity-to-entity" approach for entity correction, however, we take the query as context, and perform contextualized entity correction.…”
Section: Related Work
mentioning
confidence: 99%
“…It is a two-layer system, retrieving and re-ranks utterances and NLU hypotheses. Since we do not assume the entities are tagged in utterances, "entity-to-entity" approaches [7,8,9,10] are not appropriate baselines. Besides gUFS-QR, we also consider our two-layer entity correction system without the KG components, which is a similar design as popular entity linking system [11], as a baseline.…”
Section: Evaluation Metrics and Baseline
mentioning
confidence: 99%
“…Errors encountered in either model do not inform the other; in practice this means incorrect ASR transcriptions might be "correctly" interpreted by the NLU, fail to provide the user's desired response. While work is ongoing in detecting [8], quantifying [9,10], and rectifying [11,12] these ASR driven NLU misclassifications, end-to-end (E2E) approaches are a promising way to address this issue.…”
Section: Introduction
mentioning
confidence: 99%
“…Noise generator. 3.2 Deep Metaphone. Raghuvanshi et al. (2019) revealed that relying only on surface text similarities cannot capture phonetic differences between words. Furthermore, Han et al. (2013) showed that sound-related features are effective in text normalization.…”
mentioning
confidence: 99%