Measuring Machine Translation Errors in New Domains

Irvine, Ann; Morgan, John J. B.; Carpuat, Marine; Daumé, Hal; Munteanu, Dragos Stefan

doi:10.1162/tacl_a_00239

Cited by 54 publications

(39 citation statements)

References 9 publications


“…In light of these apparent success, we examine the failure modes of existing models for morphological generation. We first propose and motivate an error taxonomy for this task, inspired by similar proposals for other natural language generation and processing technologies such as grammatical error correction (e.g., Rozovskaya and Roth 2016) and machine translation (e.g., Popović and Ney 2011, Fishel et al 2012, Irvine et al 2013). We then use this taxonomy to perform a manual error analysis of the CoNLL-SIGMORPHON 2017 Shared Task.…”

Section: Introduction (mentioning)

confidence: 99%

Weird Inflects but OK: Making Sense of Morphological Generation Errors

Gorman, McCarthy, Cotterell, et al. 2019

Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)


We conduct a manual error analysis of the CoNLL-SIGMORPHON 2017 Shared Task on Morphological Reinflection. In this task, systems are given a word in citation form (e.g., hug) and asked to produce the corresponding inflected form (e.g., the simple past hugged). This design lets us analyze errors much like we might analyze children's production errors. We propose an error taxonomy and use it to annotate errors made by the top two systems across twelve languages. Many of the observed errors are related to inflectional patterns sensitive to inherent linguistic properties such as animacy or affect; many others are failures to predict truly unpredictable inflectional behaviors. We also find nearly one quarter of the residual "errors" reflect errors in the gold data.


“…In the real scenario, training data and test data have different distributions and the target domains are sometimes unseen. Irvine et al (2013) analyze the translation errors in such scenarios. Domain generalization aims to apply knowledge gained from labeled source domains to unseen target domains (Li et al, 2018).…”

Section: Adversarial Domain Adaptation and Domain Generation (mentioning)

confidence: 99%

A Comprehensive Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation

Chu, Dabre, Kurohashi 2018

Journal of Information Processing


Neural machine translation (NMT) is a deep learning based approach for machine translation, which yields the state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. Although the high-quality and domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and thus vanilla NMT performs poorly in such scenarios. Domain adaptation that leverages both out-of-domain parallel corpora as well as monolingual corpora for in-domain translation, is very important for domain-specific translation. In this paper, we give a comprehensive survey of the state-of-the-art domain adaptation techniques for NMT.


“…There are some studies in the area of SMT evaluation, e.g. those dealing with the errors in translation of new domains (Irvine et al, 2013). However, the error types concern the lexical level only, as the authors operate solely with the notion of domain (field of discourse) and not register (which includes more parameters, see Section 2.1 above).…”

Section: Register In Translation (mentioning)

confidence: 99%

Measuring ‘Registerness’ in Human and Machine Translation: A Text Classification Approach

Lapshinova-Koltunski, Vela 2015

Proceedings of the Second Workshop on Discourse in Machine Translation


In this paper, we apply text classification techniques to prove how well translated texts obey linguistic conventions of the target language measured in terms of registers, which are characterised by particular distributions of lexico-grammatical features according to a given contextual configuration. The classifiers are trained on German original data and tested on comparable English-to-German translations. Our main goal is to see if both human and machine translations comply with the nontranslated target originals. The results of the present analysis provide evidence for our assumption that the usage of parallel corpora in machine translation should be treated with caution, as human translations might be prone to errors.
