Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
DOI: 10.18653/v1/2022.naacl-main.197
Gender Bias in Masked Language Models for Multiple Languages

Abstract: Masked Language Models (MLMs) pre-trained by predicting masked tokens on large corpora have been used successfully in natural language processing tasks for a variety of languages. Unfortunately, it has been reported that MLMs also learn discriminative biases regarding attributes such as gender and race. Because most studies have focused on MLMs in English, the bias of MLMs in other languages has rarely been investigated. Manual annotation of evaluation data for languages other than English has been challenging due …
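As a rough illustration of the kind of probing the abstract refers to, the sketch below compares the probabilities a multilingual MLM assigns to gendered words at a masked position. It is a minimal sketch, not the paper's benchmark or metric: the model name, the probe sentence, and the word pair are illustrative assumptions.

```python
# Minimal sketch: score gendered fillers for a masked slot with an off-the-shelf MLM.
# Model, template, and word pair below are assumptions, not the paper's evaluation setup.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-multilingual-cased"  # assumed multilingual MLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

# Illustrative probe sentence with a single masked position.
sentence = f"{tokenizer.mask_token} is a nurse."
inputs = tokenizer(sentence, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits

# Probability distribution over the vocabulary at the masked position.
probs = logits[0, mask_pos].softmax(dim=-1).squeeze(0)
for word in ("He", "She"):
    token_id = tokenizer.convert_tokens_to_ids(word)
    print(f"P({word} | context) = {probs[token_id].item():.4f}")
```

A skewed ratio between the two fillers on a single template is only suggestive; bias evaluation of the kind discussed in the paper aggregates such masked-token scores over curated sentence sets rather than one hand-written example.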

Cited by 11 publications (15 citation statements)
References 55 publications
“…In this paper, we limited our investigation to English PLMs. However, as reported in a lot of previous work, social biases are language independent and omnipresent in PLMs trained for many languages (Kaneko et al., 2022c; Lewis and Lupyan, 2020; Liang et al., 2020; Zhao et al., 2020). We plan to extend this study to cover non-English PLMs in the future.…”
Section: Limitations
confidence: 77%
“…The imbalance of gender words in the training data affects the gender bias of a PLM fine-tuned using that data (Kaneko et al., 2022c). Using this fact, we propose a method to learn bias-controlled versions of PLMs that express different levels of known gender biases.…”
Section: Bias-controlled Fine-tuning
confidence: 99%
“…Milios and BehnamGhader (2022); España-Bonet and Barrón-Cedeño (2022) illustrate the inefficiency of direct translation methods, and España-Bonet and Barrón-Cedeño (2022) advocates for the creation of culturally-sensitive datasets for fairness assessment. However, Kaneko et al. (2022) proposes a way to generate parallel corpora for other languages that bears high correlation with human bias annotations.…”
Section: Grammatically Gendered Languages
confidence: 99%
“…This is particularly relevant, as in a practical setting, treating identities as composites of various demographic attributes is a necessity. Kaneko et al. (2022) measures gender bias in masked language models and proposes a method to use parallel corpora to evaluate bias in languages shown to have high correlations with human bias annotations. In cases where manually annotated data doesn't exist, this could prove helpful.…”
Section: An Outline Of Fairness Evaluation In the Context Of Multilin…
confidence: 99%