Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.154

“You Sound Just Like Your Father” Commercial Machine Translation Systems Include Stylistic Biases

Abstract: The main goal of machine translation has been to convey the correct content. Stylistic considerations have been at best secondary. We show that, as a consequence, the output of three commercial machine translation systems (Bing, DeepL, Google) makes demographically diverse samples from five languages "sound" older and more male than the original. Our findings suggest that translation models reflect demographic bias in the training data. These results open up interesting new research avenues in machine translation…

Cited by 40 publications (34 citation statements)
References 9 publications
“…To address gender bias in neural network approaches to coreference resolution Rudinger et al (2018); Zhao et al (2018). Hovy et al (2020) show that this overamplification consistently makes translations sound older and more male than the original authors.…”
Section: Countermeasures (mentioning)
confidence: 75%
“…Gender affects myriad aspects of NLP, including corpora, tasks, algorithms, and systems (Costa-jussà, 2019; Sun et al, 2019). For example, statistical gender biases are rampant in word embeddings (Jurgens et al, 2012; Bolukbasi et al, 2016; Caliskan et al, 2017; Garg et al, 2018; Zhao et al, 2018b; Basta et al, 2019; Chaloner and Maldonado, 2019; Du et al, 2019; Ethayarajh et al, 2019; Kaneko and Bollegala, 2019; Kurita et al, 2019) - including multilingual ones (Escudé Font and Costa-jussà, 2019; Zhou et al, 2019) - and affect a wide range of downstream tasks including coreference resolution (Zhao et al, 2018a; Cao and Daumé III, 2020; Emami et al, 2019), part-of-speech and dependency parsing (Garimella et al, 2019), language modeling (Qian et al, 2019; Nangia et al, 2020), appropriate turn-taking classification (Lepp, 2019), relation extraction (Gaut et al, 2020), identification of offensive content (Sharifirad and Matwin, 2019), and machine translation (Stanovsky et al, 2019; Hovy et al, 2020).…”
Section: Related Work (mentioning)
confidence: 99%
“…Style itself is a broad concept (Kang and Hovy, 2019). It includes both simple high-level stylistic aspects of language such as verbosity (Marchisio et al, 2019; Agrawal and Carpuat, 2019; Lakew et al, 2019), formality (Niu et al, 2017; Xu et al, 2019), politeness and complex aspects such as demography (Vanmassenhove et al, 2018; Moryossef et al, 2019; Hovy et al, 2020) and personal traits (Mirkin and Meunier, 2015; Rabinovich et al, 2017; Michel and Neubig, 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…Following the common practice in evaluating the style imitation (e.g. see (Michel and Neubig, 2018; Hovy et al, 2020)), we train a classifier to predict the translator style of the output of various models. We employ a Logistic Regression classifier based on both uni-gram and bi-gram word features.…”
Section: Style Imitation (mentioning)
confidence: 99%
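
The last statement above describes a style classifier built from logistic regression over uni-gram and bi-gram word features. The following is a minimal sketch of that general idea, not the citing authors' implementation: the whitespace tokeniser, the gradient-descent training loop, the hyperparameters, the toy sentences, and the 0/1 style labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_features(text):
    """Count word uni-grams and bi-grams (lowercased, whitespace-tokenised)."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(texts, labels, epochs=200, lr=0.5):
    """Fit a binary logistic regression with per-example gradient updates."""
    weights = Counter()  # unseen features read as weight 0
    bias = 0.0
    data = [(ngram_features(t), y) for t, y in zip(texts, labels)]
    for _ in range(epochs):
        for feats, y in data:
            score = bias + sum(weights[f] * v for f, v in feats.items())
            err = y - sigmoid(score)  # gradient of the log-likelihood
            bias += lr * err
            for f, v in feats.items():
                weights[f] += lr * err * v
    return weights, bias


def predict_proba(model, text):
    """Probability that `text` belongs to style 1 under the fitted model."""
    weights, bias = model
    feats = ngram_features(text)
    return sigmoid(bias + sum(weights[f] * v for f, v in feats.items()))


# Hypothetical toy data: style 0 avoids contractions, style 1 uses them.
texts = ["i do not think so", "i do not agree",
         "i don't think so", "i don't agree"]
labels = [0, 0, 1, 1]
model = train_logreg(texts, labels)
print(predict_proba(model, "i do not know"))
print(predict_proba(model, "i don't know"))
```

In practice, a library pipeline such as scikit-learn's `CountVectorizer(ngram_range=(1, 2))` followed by `LogisticRegression` would be a more typical choice; the pure-Python version is used here only to keep the sketch self-contained.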