Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-long.132

Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection

Abstract: We present a human-and-model-in-the-loop process for dynamically generating datasets and training better-performing and more robust hate detection models. We provide a new dataset of ∼40,000 entries, generated and labelled by trained annotators over four rounds of dynamic data creation. It includes ∼15,000 challenging perturbations, and each hateful entry has fine-grained labels for the type and target of hate. Hateful entries make up 54% of the dataset, which is substantially higher than in comparable datasets.…
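The abstract describes rounds of dynamic data creation in which trained annotators try to write entries the current model misclassifies, and those entries feed the next round's training set. A minimal sketch of one such round is below; the `KeywordModel` and the submission format are illustrative stand-ins, not the authors' actual system.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    text: str
    gold_label: str   # label assigned by the trained annotator
    model_label: str  # prediction of the current target model

class KeywordModel:
    """Toy stand-in for the target hate-detection model: flags any text
    containing a word from a small blocklist. A real round would use the
    classifier trained on all previous rounds' data."""
    def __init__(self, blocklist):
        self.blocklist = set(blocklist)

    def predict(self, text):
        return "hate" if set(text.lower().split()) & self.blocklist else "not_hate"

def run_round(model, submissions):
    """One round of dynamic data creation: annotators submit (text, gold_label)
    pairs crafted to fool the current model. All verified entries join the
    dataset; the ones that slip past the model are the most informative."""
    collected = [Entry(t, g, model.predict(t)) for t, g in submissions]
    fooled = [e for e in collected if e.model_label != e.gold_label]
    return collected, fooled

model = KeywordModel(["slur"])
submissions = [
    ("you people are vermin", "hate"),   # implicit hate: no blocklisted word
    ("contains slur word", "hate"),      # explicit: caught by the keyword model
    ("what a nice day", "not_hate"),
]
collected, fooled = run_round(model, submissions)
# only the implicit entry evades the toy model, so it lands in `fooled`
```

Retraining on `collected` (weighting or oversampling `fooled`) before the next round is the loop the paper's process iterates four times.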

Cited by 63 publications (70 citation statements)
References 47 publications
“…Our second study focused on collecting ratings for a larger set of posts, but with fewer annotators per post, to simulate a crowdsourced dataset on toxic language. Drawing from two existing toxic language detection corpora, we select posts that are automatically detected as AAE and/or vulgar from Founta et al. (2018), and posts that are automatically detected as vulgar and/or annotated as anti-Black from Vidgen et al. (2021). Importantly, in this study we consider anti-Black or AAE posts that could also be vulgar, and allow this vulgarity to cover both potentially offensive identity references (OI) as well as non-identity vulgar words (ONI; see §2.3).…”
Section: Breadth-of-posts Study
confidence: 99%
“…Anti-Black language denotes racially prejudiced or racist content, whether subtle (Breitfeller et al., 2019) or overt, which is often a desired target for toxic language detection research (Waseem, 2016; Vidgen et al., 2021). Based on prior work linking conservative ideologies, endorsement of unrestricted speech, and racial prejudice with a reduced likelihood of accepting the term "hate speech" (Duckitt and Fisher, 2003; White and Crandall, 2017; Roussos and Dovidio, 2018; Elers and Jayan, 2020), we hypothesize that conservative annotators and those who score highly on the RACISTBELIEFS or FREEOFFSPEECH scales will rate anti-Black tweets as less toxic, and vice versa.…”
Section: Rated As
confidence: 99%
“…Overall, they are moderately aligned with the prescriptive paradigm. Vidgen et al. (2021b) annotate hate speech. They provide annotators with fine-grained definitions for each category as well as very detailed annotation guidelines, and disagreements are resolved by an expert.…”
Section: A Overview Of Subjective Task Datasets
confidence: 99%
“…In order to reduce this risk, one idea is to clean the harmful responses in the dataset, and the other is to detect the harmfulness of the results output by the model. Both of these can be achieved using recent work on offensive speech detection (Ranasinghe and Zampieri, 2020) or hate speech detection (Vidgen et al., 2021).…”
Section: Failure Modes
confidence: 99%
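The excerpt above names two mitigation points: cleaning harmful responses out of the training data, and screening model outputs at inference time. A minimal sketch of both, assuming access to some trained hate/offensive-speech classifier behind an `is_harmful` predicate; the keyword check here is only a placeholder for such a model.

```python
BLOCKLIST = {"vermin"}  # placeholder; a real system would call a trained classifier

def is_harmful(text):
    """Stand-in detector: a real deployment would wrap a hate/offensive-speech
    classifier such as those cited in the excerpt."""
    return bool(set(text.lower().split()) & BLOCKLIST)

def clean_dataset(pairs, detector):
    """Dataset cleaning: drop (context, response) pairs whose response the
    detector flags, before the dialogue model ever trains on them."""
    return [(ctx, resp) for ctx, resp in pairs if not detector(resp)]

def guard_output(generate, detector, fallback="I'd rather not say that."):
    """Inference-time screening: wrap a generation function so that flagged
    outputs are replaced with a safe fallback response."""
    def guarded(prompt):
        response = generate(prompt)
        return fallback if detector(response) else response
    return guarded

pairs = [("hi there", "hello!"), ("what do you think", "you are vermin")]
cleaned = clean_dataset(pairs, is_harmful)       # second pair is dropped
guarded = guard_output(lambda p: "you are vermin", is_harmful)
safe = guarded("any prompt")                      # fallback string is returned
```

The two hooks are independent: cleaning reduces what the model learns, while the output guard catches whatever harmful generations remain.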