Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics 2014
DOI: 10.3115/v1/e14-1065

Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation

Abstract: We propose a novel technique for adapting text-based statistical machine translation to deal with input from automatic speech recognition in spoken language translation tasks. We simulate likely misrecognition errors using only a source language pronunciation dictionary and language model (i.e., without an acoustic model), and use these to augment the phrase table of a standard MT system. The augmented system can thus recover from recognition errors during decoding using synthesized phrases. Using the outputs …
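The abstract's central idea, simulating likely recognition errors from a pronunciation dictionary alone (no acoustic model), can be illustrated with a minimal sketch. The code below is a toy assumption-laden example, not the authors' implementation: it uses a tiny CMUdict-style dictionary, treats exact homophones as the simplest class of acoustic confusion, and omits the language-model scoring the paper uses to rank confusions.

```python
# Toy sketch of dictionary-based acoustic confusion simulation.
# All names (PRON, confusable, confusion_variants) are illustrative,
# not from the paper's actual code.
from itertools import product

# Toy pronunciation dictionary: word -> phoneme sequence (CMUdict-style).
PRON = {
    "their": ["DH", "EH", "R"],
    "there": ["DH", "EH", "R"],
    "see":   ["S", "IY"],
    "sea":   ["S", "IY"],
    "ship":  ["SH", "IH", "P"],
}

def confusable(word):
    """Return words sharing the same phoneme sequence (homophones),
    the simplest kind of simulated acoustic confusion."""
    target = PRON.get(word)
    if target is None:
        return [word]
    return sorted(w for w, p in PRON.items() if p == target)

def confusion_variants(sentence):
    """Enumerate sentences obtained by swapping each word for a
    phonetically identical alternative; in the paper these synthetic
    variants would then be scored by a source LM and used to augment
    the MT phrase table."""
    options = [confusable(w) for w in sentence.split()]
    return [" ".join(combo) for combo in product(*options)]

variants = confusion_variants("see their ship")
# Includes misrecognitions such as "sea there ship".
```

A fuller simulation would relax exact-match pronunciation lookup to approximate phonetic distance and keep only confusions the language model deems plausible.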

Cited by 24 publications (16 citation statements); references 31 publications.
“…Another promising idea was to limit the detrimental effects of early decisions, rather than attempting to avoid early decisions. One way of achieving this is to train robust translation models by introducing synthetic ASR errors into the source side of MT corpora (Peitz et al., 2012; Tsvetkov et al., 2014; Ruiz et al., 2015; Sperber et al., 2017b; Cheng et al., 2018, 2019). A different route is taken by Dixon et al. (2011) and He et al. (2011), who directly optimize ASR outputs towards translation quality.…”
Section: Toward Tight Integration
confidence: 99%
“…Despite its simplicity, this approach inevitably suffers from mistakes made by ASR models, and is error prone. Research in this direction often focuses on strategies capable of mitigating the mismatch between ASR output and MT input, such as representing ASR outputs with lattices (Saleem et al., 2004; Mathias and Byrne, 2006; Beck et al., 2019), injecting synthetic ASR errors for robust MT (Tsvetkov et al., 2014; Cheng et al., 2018) and differentiable cascade modeling (Kano et al., 2017; Anastasopoulos and Chiang, 2018; Sperber et al., 2019).…”
Section: Related Work
confidence: 99%
“…Based on the error statistics provided above and recorded in Table 3, we identify the following error types as interesting to focus on when constructing models to cope with ASR errors. 16.2% (±0.6%) of the ASR errors are either insertions or deletions on closed class words. These types are also ranked highly in our SLT experiments.…”
Section: Discussion
confidence: 99%