Getting Gender Right in Neural Machine Translation

Vanmassenhove, Eva; Hardmeier, Christian; Way, Andy

doi:10.18653/v1/d18-1334

Cited by 139 publications

(147 citation statements)

References 28 publications

Supporting

Mentioning

141

Contrasting

Unclassified

Order By: Relevance

“…For these reasons, the level of human translation has been thought to be the upper bound of the achievable performance 3 . There are also other challenges in recent MT research such as gender bias 4 or unsupervised MT 5 , which are mostly orthogonal to the present work.…”

mentioning

confidence: 99%

Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals

et al. 2020

View full text Add to dashboard Cite

The quality of human translation was long thought to be unattainable for computer translation systems. In this study, we present a deep-learning system, CUBBITT, which challenges this view. In a context-aware blind evaluation by human judges, CUBBITT significantly outperformed professional-agency English-to-Czech news translation in preserving text meaning (translation adequacy). While human translation is still rated as more fluent, CUBBITT is shown to be substantially more fluent than previous state-of-the-art systems. Moreover, most participants of a Translation Turing test struggle to distinguish CUBBITT translations from human translations. This work approaches the quality of human translation and even surpasses it in adequacy in certain circumstances.This suggests that deep learning may have the potential to replace humans in applications where conservation of meaning is the primary aim.

show abstract

mentioning

confidence: 99%

Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals

et al. 2020

View full text Add to dashboard Cite

show abstract

“…Other researchers have developed techniques for mitigating biases in monolingual English NLP tools, with a handful of techniques applied to the more complex problem of inflected languages. Some approaches which have been applied to NMT specifically are effective in limited settings -for example, adding the gender of the speaker as a feature to an NMT system during training can improve translation quality, even though the concept of a single gender per sentence is not appropriate for all translations, and speaker information is not typically available (Vanmassenhove et al 2018). Another approach to gender bias in NLP tools involves training a model with debiased word embeddings, either as a post-processing method (Bolukbasi et al 2016) or from scratch by Zhao et al (2018) for English data, and by Escudé Font and Costa-jussà (2019) for NMT specifically.…”

Section: Gender Bias In Nmt Systemsmentioning

confidence: 99%

The practical ethics of bias reduction in machine translation: why domain adaptation is better than data debiasing

et al. 2021

View full text Add to dashboard Cite

This article probes the practical ethical implications of AI system design by reconsidering the important topic of bias in the datasets used to train autonomous intelligent systems. The discussion draws on recent work concerning behaviour-guiding technologies, and it adopts a cautious form of technological utopianism by assuming it is potentially beneficial for society at large if AI systems are designed to be comparatively free from the biases that characterise human behaviour. However, the argument presented here critiques the common well-intentioned requirement that, in order to achieve this, all such datasets must be debiased prior to training. By focusing specifically on gender-bias in Neural Machine Translation (NMT) systems, three automated strategies for the removal of bias are considered – downsampling, upsampling, and counterfactual augmentation – and it is shown that systems trained on datasets debiased using these approaches all achieve general translation performance that is much worse than a baseline system. In addition, most of them also achieve worse performance in relation to metrics that quantify the degree of gender bias in the system outputs. By contrast, it is shown that the technique of domain adaptation can be effectively deployed to debias existing NMT systems after they have been fully trained. This enables them to produce translations that are quantitatively far less biased when analysed using gender-based metrics, but which also achieve state-of-the-art general performance. It is hoped that the discussion presented here will reinvigorate ongoing debates about how and why bias can be most effectively reduced in state-of-the-art AI systems.

show abstract

“…Yamagishi et al (2016) also use target side annotations during training to control active versus passive voice in the output. Vanmassenhove et al (2018) used prefixed tokens identifying the gender of the author to aid the MT system in correctly presenting gender features in discourse.…”

Section: Related Workmentioning

confidence: 99%

APE through Neural and Statistical MT with Augmented Data. ADAPT/DCU Submission to the WMT 2019 APE Shared Task

Shterionov¹,

Wagner²,

Carmo³

2019

Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

View full text Add to dashboard Cite

Automatic post-editing (APE) can be reduced to a machine translation (MT) task, where the source is the output of a specific MT system and the target is its post-edited variant. However, this approach does not consider context information that can be found in the original source of the MT system. Thus a better approach is to employ multi-source MT, where two input sequences are considered-the original source and the MT output. Extra context information can be introduced in the form of extra tokens that identify certain global properties of a group of segments, added as a prefix or a suffix to each segment. Successfully applied in domain adaptation of MT as well as on APE, this technique deserves further attention. In this work we investigate multi-source neural APE (or NPE) systems with training data which has been augmented with two types of extra context tokens. We experiment with authentic and synthetic data provided by WMT 2019 and submit our results to the APE shared task. We also experiment with using statistical machine translation (SMT) methods for APE. While our systems score bellow the baseline, we consider this work a step towards understanding the added value of extra context in the case of APE.

show abstract

Getting Gender Right in Neural Machine Translation

Cited by 139 publications

References 28 publications

Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals

Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals

The practical ethics of bias reduction in machine translation: why domain adaptation is better than data debiasing

APE through Neural and Statistical MT with Augmented Data. ADAPT/DCU Submission to the WMT 2019 APE Shared Task

Contact Info

Product

Resources

About