Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
DOI: 10.18653/v1/2022.naacl-main.97

Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate

Abstract: Detecting online hate is a complex task, and low-performing models have harmful consequences when used for sensitive applications such as content moderation. Emoji-based hate is an emerging challenge for automated detection. We present HATEMOJICHECK, a test suite of 3,930 short-form statements that allows us to evaluate performance on hateful language expressed with emoji. Using the test suite, we expose weaknesses in existing hate detection models. To address these weaknesses, we create the HATEMOJIBUILD data…
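The evaluation workflow the abstract describes — running a classifier over short-form test statements and scoring it — can be sketched as a minimal functional-test harness. Everything below is a hypothetical stand-in: the test cases, the toy keyword classifier, and all names are invented for illustration and do not reproduce the actual HATEMOJICHECK data or any real model.

```python
# Hedged sketch of a functional test-suite evaluation. The cases and
# classifier are illustrative stand-ins, not the real HateMojiCheck data.

def keyword_classifier(text: str) -> int:
    """Toy classifier: predicts 1 (hateful) if a blocklisted word appears."""
    blocklist = {"hate"}
    return int(any(word in text.lower() for word in blocklist))

# (statement, gold label) pairs. Emoji can replace the words a
# keyword-based model relies on, which is the failure mode under test.
test_cases = [
    ("I hate [GROUP]", 1),
    ("I \U0001F44E [GROUP]", 1),  # thumbs-down emoji substitutes for the word
    ("I love [GROUP]", 0),
]

def accuracy(classifier, cases):
    correct = sum(classifier(text) == gold for text, gold in cases)
    return correct / len(cases)

acc = accuracy(keyword_classifier, test_cases)
print(f"accuracy: {acc:.2f}")  # the emoji-based case is misclassified
```

The emoji case is the one the toy classifier misses, mirroring the weakness in existing models that the test suite is designed to expose.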


Cited by 20 publications (15 citation statements). References 39 publications.
“…Fairness measures were very diverse, including, for example, equalized odds (Wang et al, 2020b), demographic parity (Coston et al, 2020), equal opportunity (Cotter et al, 2019), individual fairness (Black et al, 2020), and calibration by group (Petersen et al, 2023). Capabilities included generalization (Wu et al, 2020), calibration (Hendrycks et al, 2019b), handling of linguistic phenomena (Naik et al, 2018), level of bias (Nangia et al, 2020), reasoning (Liu et al, 2019a), and task-specific capabilities, e.g., recognizing emoji-based hate (Kirk et al, 2022).…”
Section: Quantitative Results
confidence: 99%
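Two of the fairness measures named in this statement can be illustrated on toy data. This is a hedged sketch: the groups, gold labels, and predictions below are invented for illustration, and the metric names follow their standard textbook definitions rather than any particular library.

```python
# Hedged sketch: demographic parity and equal opportunity gaps on toy data.
from collections import defaultdict

# (group, gold label, predicted label) triples -- illustrative only.
records = [
    ("a", 1, 1), ("a", 0, 1), ("a", 1, 0), ("a", 0, 0),
    ("b", 1, 1), ("b", 0, 0), ("b", 1, 1), ("b", 0, 0),
]

def positive_rate(rows):
    """P(pred = 1) within a group."""
    return sum(pred for _, _, pred in rows) / len(rows)

def true_positive_rate(rows):
    """P(pred = 1 | gold = 1) within a group."""
    preds = [pred for _, gold, pred in rows if gold == 1]
    return sum(preds) / len(preds)

by_group = defaultdict(list)
for row in records:
    by_group[row[0]].append(row)

# Demographic parity: positive prediction rates should match across groups.
dp_gap = abs(positive_rate(by_group["a"]) - positive_rate(by_group["b"]))
# Equal opportunity: true positive rates should match across groups.
eo_gap = abs(true_positive_rate(by_group["a"]) - true_positive_rate(by_group["b"]))
print(dp_gap, eo_gap)
```

On this toy data the two measures disagree: both groups receive positive predictions at the same rate (demographic parity gap of 0), yet group "b" has a higher true positive rate, so the equal opportunity gap is nonzero — which is why surveys like the one quoted treat these as distinct specifications.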
“…We say a paper in our survey evaluates a specification if it measures it. That is, the paper either proposes a new method of how to evaluate a specification (e.g., by designing a test suite (Kirk et al, 2022) or a metric (Weng et al, 2018)) or studies a previously proposed specification as part of the evaluation (in the simplest case, just reporting its outcome).…”
Section: Discussion
confidence: 99%
“…Multi-source AL for NLP While AL has been studied for a variety of tasks in NLP (Siddhant and Lipton, 2018; Lowell et al, 2019; Ein-Dor et al, 2020; Shelmanov et al, 2021; Margatina et al, 2021; Yuan et al, 2022; Schröder et al, 2022; Margatina et al, 2022; Kirk et al, 2022; Zhang et al, 2022), the majority of work remains limited to settings where training data is assumed to stem from a single source. Some recent works have sought to address the issues that arise when relaxing the single-source assumption (Ghorbani et al, 2021), though results remain primarily limited to image classification.…”
Section: Related Work
confidence: 99%
“…Works such as Prabhakaran et al (2019) and Hutchinson et al (2020) partially mitigate this by using real-world data and targeting specific syntactic slots for substitution, but this can yield incoherent or contradictory text when there are multiple entities referenced in a sentence. Finally, recent template-based works such as Kirk et al (2021) have been effective at detailing problems with modern toxicity classifiers, by investing significant targeted effort into probing task-specific functionality and employing human validation for generated examples.…”
Section: Counterfactual Generation
confidence: 99%