2022
DOI: 10.48550/arxiv.2206.08978
Preprint

Towards a Deep Multi-layered Dialectal Language Analysis: A Case Study of African-American English

Abstract: Currently, natural language processing (NLP) models proliferate language discrimination leading to potentially harmful societal impacts as a result of biased outcomes. For example, part-of-speech taggers trained on Mainstream American English (MAE) produce noninterpretable results when applied to African American English (AAE) as a result of language features not seen during training. In this work, we incorporate a human-in-the-loop paradigm to gain a better understanding of AAE speakers' behavior and their la…

Cited by 1 publication (3 citation statements)
References 11 publications
“…Recently, there has been a surge in NLP research for AAE. Studies have explored dependency parsing (Blodgett et al, 2018), POS-tagging (Dacon, 2022; Jørgensen et al, 2016), hate speech classification (Harris et al, 2022; Sap et al, 2019), automatic speech recognition (Koenecke et al, 2020; Martin and Tang, 2020), dialectal analysis (Blodgett et al, 2016; Dacon, 2022; Stewart, 2014) and feature detection (Masis et al, 2022; Santiago et al, 2022). Projects such as these rely heavily on large amounts of labeled data, however, little research is dedicated to optimizing the disambiguation and annotation process.…”
Section: Related Work (mentioning)
confidence: 99%
“…Internet corpora, particularly Twitter data, dominate the data used for NLP research on AAE (Deas et al, 2023; Dacon, 2022; Harris et al, 2022; Blodgett et al, 2016, 2018; Stewart, 2014; Jones, 2015; Jørgensen et al, 2016; Koufakou et al, 2020). AAE is extremely pervasive online, however, it is important to recognize its use in other domains such as everyday speech.…”
Section: Related Work (mentioning)
confidence: 99%