Interspeech 2017
DOI: 10.21437/interspeech.2017-1482

Investigating the Effect of ASR Tuning on Named Entity Recognition

Abstract: Information retrieval from speech is a key technology for many applications, as it allows access to large amounts of audio data. This technology requires two major components: an automatic speech recognizer (ASR) and a text-based information retrieval module such as a keyword extractor or a named entity recognizer (NER). When combining the two components, the resulting final application needs to be globally optimized. However, ASR and information retrieval are usually developed and optimized separately. The A…

Cited by 5 publications (3 citation statements)
References 9 publications
“…For instance, this metric does not distinguish between errors on verbs or proper nouns while such errors do not have the same impact for NER. To compensate this problem, some dedicated metrics to tune ASR systems for better NER performances have been proposed, such as in [3]. Another inconvenience is that usually no information about named entities are used in the ASR process, while such information could help to better choose the partial recognition hypotheses that are dropped away during the decoding process.…”
Section: Introduction (mentioning)
confidence: 99%
“…Another group of similar tasks are information retrieval from speech [16,17,18,19]. These tasks are either based on a pipeline of several systems or a single E2E system [18].…”
Section: Related Work (mentioning)
confidence: 99%
“…Additionally, ASR systems are generally tuned measuring word error rate (WER) on a validation corpus, but this metric is not optimal for the subsequent SLU task (semantic parsing, NER, etc.). To compensate this, some specialized metrics to tune ASR have been proposed [2]. However, the number of SLU task applied on the ASR output may be large and considering dedicated metrics for each one of them is not feasible.…”
Section: Introduction (mentioning)
confidence: 99%