Detecting Harmful Memes and Their Targets

Pramanick, Shraman; Dimitrov, Dimitar; Mukherjee, Rituparna; Sharma, Shivam; Akhtar, Md. Shad; Nakov, Preslav; Chakraborty, Tanmoy

doi:10.18653/v1/2021.findings-acl.246

Cited by 53 publications

(23 citation statements)

References 38 publications

“…Additionally, the work around flagging harmful content has focused majorly on text-based features as they are easier to collect. Meanwhile, the usage of memes and videos (short clips and long ones) spreading toxic and harmful content has been gaining momentum [43,63,64]. We need to study the impact of bias in multi-modal content.…”

Section: Case Study: Shift In Bias Due To Knowledge-based Generalizat... (mentioning)

confidence: 99%

Handling Bias in Toxic Speech Detection: A Survey

Garg, Masud, Suresh et al. 2022

Preprint

Self Cite

The massive growth of social media usage has witnessed a tsunami of online toxicity in terms of hate speech, abusive posts, cyberbullying, etc. Detecting online toxicity is challenging due to its inherent subjectivity. Factors such as the context of the speech, geography, socio-political climate, and background of the producers and consumers of the posts play a crucial role in determining if the content can be flagged as toxic. Adoption of automated toxicity detection models in production can lead to a sidelining of the various demographic and psychographic groups they aim to help in the first place. It has piqued researchers' interest in examining unintended biases and their mitigation. Due to the nascent and multi-faceted nature of the work, complete literature is chaotic in its terminologies, techniques, and findings. In this paper, we put together a systematic study to discuss the limitations and challenges of existing methods. We start by developing a taxonomy for categorising various unintended biases and a suite of evaluation metrics proposed to quantify such biases. We take a closer look at each proposed method for evaluating and mitigating bias in toxic speech detection. To examine the limitations of existing methods, we also conduct a case study to introduce the concept of bias shift due to knowledge-based bias mitigation methods. The survey concludes with an overview of the critical challenges, research gaps and future directions. While reducing toxicity on online platforms continues to be an active area of research, a systematic study of various biases and their mitigation strategies will help the research community produce robust and fair models.

“…Shang et al (2021a) proposed knowledge-enriched graph neural networks that use common-sense knowledge for offensive memes detection. Pramanick et al (2021a) focused on detecting COVID-19-related harmful memes and highlighted the challenge posed by the inherent biases within the existing multimodal systems. Pramanick et al (2021b) released another dataset focusing on US Politics and proposed a multimodal framework for harmful meme detection.…”

Section: Related Work (mentioning)

confidence: 99%

DISARM: Detecting the Victims Targeted by Harmful Memes

Sharma, Akhtar, Nakov et al. 2022

Preprint

Self Cite

Internet memes have emerged as an increasingly popular means of communication on the Web. Although typically intended to elicit humour, they have been increasingly used to spread hatred, trolling, and cyberbullying, as well as to target specific individuals, communities, or society on political, socio-cultural, and psychological grounds. While previous work has focused on detecting harmful, hateful, and offensive memes, identifying whom they attack remains a challenging and underexplored area. Here we aim to bridge this gap. In particular, we create a dataset where we annotate each meme with its victim(s) such as the name of the targeted person(s), organization(s), and community(ies). We then propose DISARM (Detecting vIctimS targeted by hARmful Memes), a framework that uses named entity recognition and person identification to detect all entities a meme is referring to, and then, incorporates a novel contextualized multimodal deep neural network to classify whether the meme intends to harm these entities. We perform several systematic experiments on three test setups, corresponding to entities that are (a) all seen while training, (b) not seen as a harmful target on training, and (c) not seen at all on training. The evaluation results show that DISARM significantly outperforms ten unimodal and multimodal systems. Finally, we show that DISARM is interpretable and comparatively more generalizable and that it can reduce the relative error rate for harmful target identification by up to 9 points absolute over several strong multimodal rivals.

“…Motivation. Internet memes, which are often presented as images with accompanying text, are increasingly abused to spread hatred under the guise of humor [8,13,24]. To fight against the proliferation of hateful memes, Facebook has recently released a large hateful meme dataset and crowdsourced hateful meme classification solutions [13].…”

Section: Introduction (mentioning)

confidence: 99%

On Explaining Multimodal Hateful Meme Detection Models

Hee, Lee, Chong 2022

Proceedings of the ACM Web Conference 2022

Hateful meme detection is a new multimodal task that has gained significant traction in academic and industry research communities. Recently, researchers have applied pre-trained visual-linguistic models to perform the multimodal classification task, and some of these solutions have yielded promising results. However, what these visual-linguistic models learn for the hateful meme classification task remains unclear. For instance, it is unclear if these models are able to capture the derogatory or slur references in the multimodality (i.e., image and text) of the hateful memes. To fill this research gap, this paper proposes three research questions to improve our understanding of these visual-linguistic models performing the hateful meme classification task. We found that the image modality contributes more to the hateful meme classification task, and the visual-linguistic models are able to perform visual-text slur grounding to a certain extent. Our error analysis also shows that the visual-linguistic models have acquired biases, which resulted in false-positive predictions. CCS Concepts: Computing methodologies → Natural language processing; Computer vision representations.
