2021
DOI: 10.48550/arxiv.2108.05921
Preprint

Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate


Cited by 8 publications (9 citation statements)
References: 0 publications
“…The original HATECHECK (Röttger et al., 2021b) then introduced functional tests for hate speech detection models, using hand-crafted test cases to diagnose model weaknesses on different kinds of hate and non-hate. Kirk et al. (2021) applied the same framework to emoji-based hate. Manerba and Tonelli (2021) provide smaller-scale functional tests for abuse detection systems.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
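
The hand-crafted functional-test approach described in this citation statement is straightforward to reproduce. Below is a minimal sketch, assuming a classifier with a plain text-to-label interface; the field names, functionality labels, and example cases are illustrative assumptions, not HATECHECK's actual schema.

```python
# A minimal sketch of functional testing in the HATECHECK style, assuming a
# classifier with a text -> label interface. Field names and cases are
# illustrative assumptions, not the actual dataset schema.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class FunctionalTestCase:
    functionality: str  # which model capability the case probes
    text: str           # hand-crafted input
    gold_label: str     # expected output: "hateful" or "non-hateful"

# Hand-crafted cases; placeholders stand in for real slurs or target groups.
CASES = [
    FunctionalTestCase("derogation_expressed", "I hate [IDENTITY].", "hateful"),
    FunctionalTestCase("neutral_identity_mention", "Many [IDENTITY] live here.", "non-hateful"),
    FunctionalTestCase("positive_identity_mention", "I love [IDENTITY].", "non-hateful"),
]

def run_suite(classifier, cases):
    """Return per-functionality accuracy to diagnose model weaknesses."""
    hits, totals = defaultdict(int), defaultdict(int)
    for case in cases:
        totals[case.functionality] += 1
        hits[case.functionality] += classifier(case.text) == case.gold_label
    return {f: hits[f] / totals[f] for f in totals}

# Example: a trivial keyword baseline standing in for a real model.
baseline = lambda text: "hateful" if "hate" in text.lower() else "non-hateful"
print(run_suite(baseline, CASES))
```

Grouping results by functionality, rather than reporting a single accuracy number, is what lets this style of suite localize a model's weaknesses to specific kinds of hate and non-hate.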
“…For these reasons, recent hate speech research has introduced novel test sets and methods that allow for a more targeted evaluation of model functionalities (Calabrese et al., 2021; Kirk et al., 2021; Mathew et al., 2021; Röttger et al., 2021b). However, these novel test sets, like most hate speech datasets so far, focus on English-language content.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Visio-linguistic stress testing. There are a number of existing multimodal stress tests about correctly understanding implausible scenes [13], exploitation of language and vision priors [11,27], single word mismatches [64], hate speech detection [26,32,41,92], memes [39,75], ablation of one modality to probe the other [22], distracting models with visual similarity between images [7,33], distracting models with textual similarity between many suitable captions [1,17], collecting more diverse image-caption pairs beyond the predominantly English and North American/Western European datasets [50], probing for an understanding of verb-argument relationships [30], counting [53], or specific model failure modes [65,69]. Many of these stress tests rely only on synthetically generated images, often with minimal visual differences, but no correspondingly minimal textual changes [80].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…An alternative is to replace some terms with placeholders, e.g. "that [IDENTITY] is a [SLUR]" or "I hate [IDENTITY]", to convey syntax and some semantics while avoiding actual hate towards a specific target group (Röttger et al., 2021; Kirk et al., 2021a). Disclaim: Clearly identify the content's origin and thereby disclaim it as an example. For example, political ads should be labelled as "political content" and a distinct visual style should be used.…”
Section: Presenting Textual Harms For Publication
Citation type: mentioning
confidence: 99%
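
The placeholder technique quoted above can be mechanized as a simple substitution pass. Below is a minimal sketch, assuming a hypothetical list of identity terms and a regex replacement; the cited papers do not prescribe this particular implementation.

```python
# A minimal sketch of masking target terms with placeholders before quoting
# harmful examples. The term list and regex approach are assumptions, not
# the procedure used by the cited papers.
import re

# Hypothetical identity terms to mask; a real list would be far larger.
IDENTITY_TERMS = ["women", "muslims", "immigrants", "trans people"]
IDENTITY_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in IDENTITY_TERMS) + r")\b",
    flags=re.IGNORECASE,
)

def mask_targets(text: str) -> str:
    """Replace identity terms with [IDENTITY] so a quoted example conveys
    syntax and semantics without directing hate at a specific group."""
    return IDENTITY_PATTERN.sub("[IDENTITY]", text)

print(mask_targets("I hate immigrants"))  # -> "I hate [IDENTITY]"
```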