2022
DOI: 10.48550/arxiv.2206.03390
Preprint

Gender Bias in Word Embeddings: A Comprehensive Analysis of Frequency, Syntax, and Semantics

Aylin Caliskan,
Pimparkar Parth Ajay,
Tessa Charlesworth
et al.

Abstract: Word embeddings are numeric representations of meaning derived from word co-occurrence statistics in corpora of human-produced texts. The statistical regularities in language corpora encode well-known social biases into word embeddings (e.g., the word vector for family is closer to the vector for women than to the vector for men). Although efforts have been made to mitigate bias in word embeddings, with the hope of improving fairness in downstream Natural Language Processing (NLP) applications, these efforts will remain limited u…

Cited by 1 publication (1 citation statement)
References 30 publications
“…One of the important ideas that follows from the Yoneda embedding is that of a representable functor. This is a functor between categories F : C → D such that there exists an object c of C satisfying F(x) = h_c(x) = hom(c, x) for all objects x of C. In the case of word embeddings, if one wishes to know whether a certain embedded word c is gender biased, one would look at the difference between the cosine distance from c to the word man and the cosine distance from c to the word woman (see [CAC+22]). In the enriched setting, this translates to h_c(man) − h_c(woman).…”
Section: Theorem 36 a Neural Network (Viewed As A Parameterized Funct...)
confidence: 99%
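The bias measure quoted above can be sketched concretely. The following is a minimal illustration, not the cited paper's implementation: it scores a word's gender bias as the difference of cosine similarities to the vectors for man and woman, using hypothetical toy 3-d vectors in place of real learned embeddings (which are typically hundreds of dimensions).

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_bias(c, man, woman):
    # Analogue of h_c(man) - h_c(woman): positive values mean c lies
    # closer (in cosine terms) to "man" than to "woman".
    return cosine(c, man) - cosine(c, woman)

# Toy vectors for illustration only (assumed, not taken from any corpus).
man    = np.array([1.0, 0.0, 0.0])
woman  = np.array([0.0, 1.0, 0.0])
family = np.array([0.2, 0.9, 0.1])  # leans toward "woman", echoing the abstract's example

print(gender_bias(family, man, woman))  # negative: "family" sits closer to "woman"
```

With real embeddings one would substitute the learned vectors for these toy arrays; the sign and magnitude of the score then quantify the direction and strength of the association.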