Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer 2021
DOI: 10.18653/v1/2021.acl-long.250

Human-in-the-Loop for Data Collection: a Multi-Target Counter Narrative Dataset to Fight Online Hate Speech

Abstract: Undermining the impact of hateful content with informed and non-aggressive responses, called counter narratives, has emerged as a possible solution for having healthier online communities. Thus, some NLP studies have started addressing the task of counter narrative generation. Although such studies have made an effort to build hate speech / counter narrative (HS/CN) datasets for neural generation, they fall short in reaching either high-quality and/or high-quantity. In this paper, we propose a novel human-in-th…

Cited by 37 publications (46 citation statements)
References 25 publications
“…Mathew et al (2021) are the only ones to mention Caucasian as a target group, which we mark with a '-' to signal that this paper is explicit about not restricting itself to non-dominant groups. Most papers with a ✓ for ND specifically define their targets to apply to non-dominant groups only (Chung et al, 2019; Basile et al, 2019; Fanton et al, 2021), with the exception of Talat and Hovy (2016), who mention minorities.…”
Section: Overview Of Definitions and Datasets (mentioning)
confidence: 99%
“…Adding data [16,45,51,121]; Relabeling data [76]; Reweighting data [12,64,137]; Collecting expert labels [98]; Passive observation [69,84,118]…”
Section: Active Data Collection (mentioning)
confidence: 99%
“…Data collection may transform D into D by asking the expert to approve the weight placed on [12] or to provide a label for a datapoint [76]. Experts may also review new datapoints, where each new datapoint x is selected by some heuristic (e.g., high uncertainty regions) and the corresponding y is specified by the expert [45,121]. Some work considers collecting multiple labels from various experts for each x [98].…”
Section: Observation To Dataset (mentioning)
confidence: 99%
“…However the approach was deemed inferior to a fully manual approach due to data quality issues with the synthetic dataset. Finally, Fanton et al [17] propose a data collection method where a language model is iteratively finetuned on human-edited versions of its own generations.…”
Section: Synthetic Dataset Creation and Augmentation (mentioning)
confidence: 99%