Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society
DOI: 10.1145/3514094.3534136
American == White in Multimodal Language-and-Image AI

Abstract: Three state-of-the-art language-and-image AI models, CLIP, SLIP, and BLIP, are evaluated for evidence of a bias previously observed in social and experimental psychology: equating American identity with being White. Embedding association tests (EATs) using standardized images of self-identified Asian, Black, Latina/o, and White individuals from the Chicago Face Database (CFD) reveal that White individuals are more associated with collective in-group words than are Asian, Black, or Latina/o individuals, with ef…
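The abstract's core measurement is the embedding association test (EAT): a differential-association effect size computed over cosine similarities between target embeddings (here, image embeddings of faces) and attribute embeddings (here, text embeddings of in-group and out-group words). Below is a minimal sketch of that statistic, assuming embeddings have already been extracted from a model such as CLIP; all variable names, word examples, and the random stand-in data are illustrative and not taken from the paper's released code.

import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    # s(w, A, B): mean cosine similarity of target w to attribute set A
    # minus its mean cosine similarity to attribute set B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def eat_effect_size(X, Y, A, B):
    # Effect size d: difference in mean association between target groups X and Y,
    # normalized by the standard deviation of associations over all targets.
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)

# Hypothetical usage with random vectors standing in for real model embeddings:
rng = np.random.default_rng(0)
group_x_images = rng.normal(size=(10, 512))    # e.g., CLIP embeddings of White CFD faces
group_y_images = rng.normal(size=(10, 512))    # e.g., CLIP embeddings of Asian CFD faces
ingroup_words = rng.normal(size=(5, 512))      # e.g., text embeddings of collective in-group words
outgroup_words = rng.normal(size=(5, 512))     # e.g., text embeddings of out-group words
print(eat_effect_size(group_x_images, group_y_images, ingroup_words, outgroup_words))

A positive effect size under this sketch would indicate that the first target group is more strongly associated with the first attribute set, which is the direction of bias the abstract reports for White individuals and collective in-group words.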

Cited by 19 publications (9 citation statements). References 55 publications.
“…Recently, work has begun that focuses on particular biases that have emerged in multi-modal language-vision machine learning systems. Wolfe and Caliskan (2022) reported that racial biases about American identity, previously observed in social psychology, are learned by multi-modal embedding models and propagated to downstream tasks. Other works proposed evaluation frameworks to assess biases in text-to-image systems (Cho, Zala, and Bansal 2022) or training mechanisms such as adversarial learning to reduce representation biases in language-vision models (Berg et al. 2022).…”
Section: Related Work
confidence: 99%
“…At the same time, commentators have noted potential ethical issues related to the use of copyrighted artworks in the training sets, the generation of hateful and offensive content, as well as issues of bias and diversity in the model outputs. Relating to the latter, research work has begun to audit the output of such models, investigating stereotypical associations between occupations and particular races and genders (Cho, Zala, and Bansal 2022), as well as between the word "American" and lighter skin colours (Wolfe and Caliskan 2022).…”
Section: Introduction
confidence: 99%
“…Although there has been research on fairness in multimodal contexts (Wolfe and Caliskan, 2022; Wolfe et al., 2022), in a first-of-its-kind study, Wang et al. (2022) look at fairness from a multilingual view in multimodal representations. Whilst they find that multimodal representations may be individually fair, i.e., similar text representations across languages translate to similar images, this concept of fairness does not extend across multiple groups.…”
Section: An Outline Of Fairness Evaluation In the Context Of Multilin...
confidence: 99%
“…Research has likewise uncovered extensive evidence of bias in vision-language AI models trained on internet-scale data [29,145,146]. Data sets used to train vision-language models have been found to contain "troublesome and explicit images and text pairs of rape, pornography, malign stereotypes, racist and ethnic slurs, and other extremely problematic content" [32].…”
Section: Bias In Vision-language Models
confidence: 99%