Proceedings of the 2021 International Conference on Multimodal Interaction 2021
DOI: 10.1145/3462244.3479949

An Interpretable Approach to Hateful Meme Detection

Abstract: Hateful memes are an emerging method of spreading hate on the internet, relying on both images and text to convey a hateful message. We take an interpretable approach to hateful meme detection, using machine learning and simple heuristics to identify the features most important to classifying a meme as hateful. In the process, we build a gradient-boosted decision tree and an LSTM-based model that achieve comparable performance (73.8 validation and 72.7 test auROC) to the gold standard of humans and state-of-th…
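The interpretable pipeline the abstract describes can be sketched as follows. This is not the authors' code: the features, data, and model configuration here are hypothetical stand-ins, showing only the general pattern of fitting a gradient-boosted tree on per-meme features and reading off feature importances.

```python
# Hedged sketch (not the paper's implementation): a gradient-boosted tree
# over hypothetical meme features, inspected via feature importances.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Hypothetical per-meme features, e.g. text toxicity score, slur count,
# image-tag signal, caption length (placeholders, not the paper's features).
X = rng.normal(size=(500, 4))
# Synthetic labels driven mostly by features 0 and 2.
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

clf = GradientBoostingClassifier(random_state=0).fit(X[:400], y[:400])
auc = roc_auc_score(y[400:], clf.predict_proba(X[400:])[:, 1])

# Interpretability hook: which features the ensemble relied on most.
importances = clf.feature_importances_
```

Because the classifier is a tree ensemble, `feature_importances_` gives a direct, model-native ranking of which input features drive the hateful/not-hateful decision, which is the kind of inspection an interpretable approach enables.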



Cited by 10 publications (4 citation statements) | References 19 publications
“…The detection of Hateful Memes represents a binary classification challenge, seeking to ascertain whether a meme is offensive or hateful through the analysis of multimodal data constituting both text and image signals. Given the inherent complexity of memes, characterized by the coexistence of two modalities, and recognizing the formidable obstacle of detection accuracy in this task, a multi-task learning technique is deemed essential for drawing meaningful statistical conclusions [10][11].…”
Section: Methods
confidence: 99%
“…Of 8 papers in the geoscience domain, 5 were in classification [347]- [351], 1 was in recognition [352], detection [353], and precision [354]. In total 10 articles were identified in the social media domain, where 4 were in detection [355]- [358], 2 were in analysis [359], [360] and 1 was in classification [361], prediction [362], recommendation [363] and verification [364]. In the vehicle domain, 6 articles were found; from them, 2 were in recognition [365], [366], 1 was in identification [367] and 3 were in detection [368]- [370].…”
Section: Inclusion Criteria
confidence: 99%
“…The remaining 57 articles were related to combined models of model-based and agnostic-based models. Seven types of the combination were found: early & late [65], [78], [80], [86], [87], [109], [118], [123], [131], [135], [151], [165], [190], [206], [234], [235], [239], [251], [255], [269], [278], [281], [297], [302], [304], [314], [350], [358], [361], [363], [403], [412], [414], early & late & hybrid [74], [117], [232], [260], [263], [284], [331], [354], [370], early & late & kernel [150], [238], early & late & neural networks [60], [67], [183], [288], [409], early & neural networks [81], [226], [375], late & hybrid [309]<...…”
Section: F. Fusion
confidence: 99%
“…Each has its own peculiarities, often related to information extraction techniques, especially from the visual content. Among them, the most used are text removal, object detection, image captioning, and entity recognition, such as in (Deshpande and Mani, 2021), (Pramanick et al, 2021) and (Lee et al, 2021). First of all, we exploit single modalities separately, through state-of-the-art deep models, to investigate their relevance and the quantity of information they carry on.…”
Section: Related Work
confidence: 99%