Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021
DOI: 10.18653/v1/2021.emnlp-main.375

Sociolectal Analysis of Pretrained Language Models

Abstract: Using data from English cloze tests, in which subjects also self-reported their gender, age, education, and race, we examine performance differences of pretrained language models across demographic groups, defined by these (protected) attributes. We demonstrate wide performance gaps across demographic groups and show that pretrained language models systematically disfavor young non-white male speakers; i.e., not only do pretrained language models learn social biases (stereotypical associations) -pretrained lan…

Cited by 9 publications (9 citation statements)
References 15 publications

“…Additionally, many of the language models used today carry assumptions about the acoustic tones and rhythms found in “typical” conversational speech. This could negatively impact pain patients with speech impediments (164) or those whose language patterns do not match that of model creators (typically White, English-speaking men) (165, 166), and could exclude or delegitimize the use of lyrics, rhymes, or singing, all of which can also be used to communicate pain (167).…”
Section: State-of-the-art In Pain Methods (mentioning)
confidence: 99%
“…Due to these variations, treating "a language" as a homogeneous mass limits cultural adaptation, and runs the risk of privileging certain cultures over others. Zhang et al. (2021) find that pretrained language models (PLMs; see §6) reflect certain sociolects more than others. For example, there are considerable morphosyntactic variations between Spanish spoken in Spain and Argentina (Bentivoglio and Sedano, 2011), but they are not considered separately in a Spanish PLM (Cañete et al., 2020).…”
Section: Linguistic Form and Style (mentioning)
confidence: 99%
“…The common methodology for training machine learning models (e.g., empirical loss minimisation) relies on maximising average performance across training examples (instead of groups, e.g., languages), which often leads to low minority performance, a phenomenon named representation disparity (Hashimoto et al., 2018). Model performance for minorities is often disregarded in favour of majority groups, as shown for race (Blodgett and O'Connor, 2017), gender (Jørgensen and Søgaard, 2021), and age (Zhang et al., 2021). Deriving fair models from biased data is a promising countermeasure (Mehrabi et al., 2021).…”
Section: Model Training (mentioning)
confidence: 99%
“…Regions with smaller populations than others contain fewer online users and are hence underrepresented in the training data. Particularly, Zhang et al. (2021) show that most of the word embeddings reflect more of the language habits of European-educated males, neglecting other subsets of the population. This constitutes a biased selection of the population (Hershcovich et al., 2022; Ma et al., 2022) and raises concerns about the non-selected groups' representation within the dataset (Hershcovich et al., 2022; Wolfe and Caliskan, 2021), which will probably cause harms in applications.…”
Section: Introduction (mentioning)
confidence: 98%