Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1160

Gender-preserving Debiasing for Pre-trained Word Embeddings

Abstract: Word embeddings learnt from massive text collections have demonstrated significant levels of discriminative biases such as gender, racial or ethnic biases, which in turn bias the down-stream NLP applications that use those word embeddings. Taking gender-bias as a working example, we propose a debiasing method that preserves non-discriminative gender-related information, while removing stereotypical discriminative gender biases from pre-trained word embeddings. Specifically, we consider four types of informatio…

Cited by 95 publications (106 citation statements)
References: 48 publications

“…Gender affects myriad aspects of NLP, including corpora, tasks, algorithms, and systems Costa-jussà, 2019; Sun et al, 2019). For example, statistical gender biases are rampant in word embeddings (Jurgens et al, 2012; Bolukbasi et al, 2016; Caliskan et al, 2017; Garg et al, 2018; Zhao et al, 2018b; Basta et al, 2019; Chaloner and Maldonado, 2019; Du et al, 2019; Ethayarajh et al, 2019; Kaneko and Bollegala, 2019; Kurita et al, 2019; -including multilingual ones (Escudé Font and Costa-jussà, 2019; Zhou et al, 2019)-and affect a wide range of downstream tasks including coreference resolution (Zhao et al, 2018a; Cao and Daumé III, 2020; Emami et al, 2019), part-of-speech and dependency parsing (Garimella et al, 2019), language modeling (Qian et al, 2019; Nangia et al, 2020), appropriate turn-taking classification (Lepp, 2019), relation extraction (Gaut et al, 2020), identification of offensive content (Sharifirad and Matwin, 2019;, and machine translation (Stanovsky et al, 2019; Hovy et al, 2020).…”
Section: Related Work (mentioning)
confidence: 99%
“…This post-processing operation has been repeatedly proposed in different contexts such as with distributional (counting-based) word representations (Sahlgren et al, 2016) and sentence embeddings (Arora et al, 2017). Independently to the above, autoencoders have been widely used for fine-tuning pre-trained word embeddings such as for removing gender bias (Kaneko and Bollegala, 2019), meta-embedding (Bao and Bollegala, 2018), cross-lingual word embedding (Wei and Deng, 2017) and domain adaptation (Chen et al, 2012), to name a few. However, it is unclear whether better performance is obtained simply by applying an autoencoder (a self-supervised task, requiring no labelled data) on pre-trained word embeddings, without performing any task-specific fine-tuning (requires labelled data for the task).…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, the NLP community has focused on exploring gender bias in NLP systems (Sun et al, 2019), uncovering many gender disparities and harmful biases in algorithms and text (Cao and Chang and McKeown 2019; Costa-jussà 2019; Du et al 2019; Emami et al 2019; Garimella et al 2019; Gaut et al 2020; Habash et al 2019; Hashempour 2019; Hoyle et al 2019; Lee et al 2019a; Lepp 2019; Qian 2019; Sharifirad and Matwin 2019; Stanovsky et al 2019; O'Neil 2016; Blodgett et al 2020; Nangia et al 2020). Particular attention has been paid to uncovering, analyzing, and removing gender biases in word embeddings (Basta et al, 2019; Kaneko and Bollegala, 2019; Zhao et al, , 2018b; Bolukbasi et al, 2016). This word embedding work has even extended to multilingual work on gender-marking Williams et al, 2019; Zhou et al, 2019;.…”
Section: Related Work (mentioning)
confidence: 99%