HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

Mathew, Binny; Saha, Punyajoy; Yimam, Seid Muhie; Biemann, Chris; Goyal, Pawan; Mukherjee, Animesh

doi:10.48550/arxiv.2012.10289

Cited by 32 publications

(68 citation statements)

References 32 publications


“…Hate speech detection has branched into several sub-tasks like toxic span extraction [30,31], rationale identification [32] and hate target identification [20]. Though recent advancement in the field of NLP has pushed the limits of hate speech identification, like transformers [25] and graph neural networks [33,25,34] with people attempting to induce external knowledge leveraging author profiling [25] or ideology [35] but using context of the conversation is still a challenge with very little work exploring this problem.…”

Section: Related Work (mentioning)

confidence: 99%

Leveraging Transformers for Hate Speech Detection in Conversational Code-Mixed Tweets

Farooqi, Ghosh, Shah. 2021. Preprint.


In the current era of the internet, where social media platforms are easily accessible for everyone, people often have to deal with threats, identity attacks, hate, and bullying due to their association with a cast, creed, gender, religion, or even acceptance or rejection of a notion. Existing works in hate speech detection primarily focus on individual comment classification as a sequence labelling task and often fail to consider the context of the conversation. The context of a conversation often plays a substantial role when determining the author's intent and sentiment behind the tweet. This paper describes the system proposed by team MIDAS-IIITD for HASOC 2021 subtask 2, one of the first shared tasks focusing on detecting hate speech from Hindi-English code-mixed conversations on Twitter. We approach this problem using neural networks, leveraging the transformer's cross-lingual embeddings and further finetuning them for low-resource hate-speech classification in transliterated Hindi text. Our best performing system, a hard voting ensemble of Indic-BERT, XLM-RoBERTa, and Multilingual BERT, achieved a macro F1 score of 0.7253, placing us 1st on the overall leaderboard standings.
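The abstract above mentions a hard voting ensemble of three models. As a rough illustration only, and not the authors' implementation, hard voting can be sketched as a per-example majority vote over each model's predicted label. The function name `hard_vote`, the label strings, and the prediction lists below are hypothetical placeholders, not taken from the paper.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Return the majority label per example across several models.

    predictions_per_model: list of equal-length label lists, one per model.
    Assumption: with an odd number of models and two classes, ties cannot occur.
    """
    n_examples = len(predictions_per_model[0])
    if any(len(preds) != n_examples for preds in predictions_per_model):
        raise ValueError("all models must predict the same number of examples")

    voted = []
    # zip(*...) groups the labels that each model assigned to the same example
    for labels in zip(*predictions_per_model):
        majority_label, _count = Counter(labels).most_common(1)[0]
        voted.append(majority_label)
    return voted


# Hypothetical predictions from three fine-tuned classifiers (illustrative only)
model_a = ["hate", "normal", "hate", "normal"]
model_b = ["hate", "hate", "normal", "normal"]
model_c = ["normal", "hate", "hate", "normal"]

ensemble = hard_vote([model_a, model_b, model_c])
print(ensemble)
```

Hard voting uses only the predicted labels, so the three models do not need calibrated or comparable probability outputs.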



“…The datasets selected for the study are the HateXplain [17], Social Bias Inference Corpus (SBIC) [21], and the Jigsaw datasets. Our selection of these three datasets is founded on the basis that they address a similar problem (toxic text), yet they are diverse in how the annotations were collected.…”

Section: Data (mentioning)

confidence: 99%

“…To navigate the challenges with existing datasets, several studies have suggested alternative approaches to annotation tasks and model development. For example, Matthew et al [17] posit that training models by highlighting the portion of a particular text that people use to distinguish offensive text from normal text can improve model performance. Also, Sap et al [20] show that priming annotators before annotation tasks can reduce their insensitivity to different dialects and the occurrence of bias in ground-truth labels.…”

Section: Introduction (mentioning)

confidence: 99%

Ground-Truth, Whose Truth? -- Examining the Challenges with Annotating Toxic Text Datasets

Arhin, Baldini, Wei et al. 2021. Preprint.


The use of machine learning (ML)-based language models (LMs) to monitor content online is on the rise. For toxic text identification, task-specific fine-tuning of these models are performed using datasets labeled by annotators who provide ground-truth labels in an effort to distinguish between offensive and normal content. These projects have led to the development, improvement, and expansion of large datasets over time, and have contributed immensely to research on natural language. Despite the achievements, existing evidence suggests that ML models built on these datasets do not always result in desirable outcomes. Therefore, using a design science research (DSR) approach, this study examines selected toxic text datasets with the goal of shedding light on some of the inherent issues and contributing to discussions on navigating these challenges for existing and future projects. To achieve the goal of the study, we re-annotate samples from three toxic text datasets and find that a multi-label approach to annotating toxic text samples can help to improve dataset quality. While this approach may not improve the traditional metric of inter-annotator agreement, it may better capture dependence on context and diversity in annotators. We discuss the implications of these results for both theory and practice.

CCS Concepts: Applied computing → Annotation.


“…-Hate Speech: During the discussion of important events, some users can behave aggressively and even use hate speech towards other individuals or groups of people. We apply RoBERTa [23] fine-tuned for the task of detecting hate speech and offensive language [25] to the tweet contents. Text filtering and cleaning are applied as in sentiment analysis.…”

Section: Experimental Design (mentioning)

confidence: 99%

BlackLivesMatter 2020: An Analysis of Deleted and Suspended Users in Twitter

Toraman, Şahinuç, Yilmaz. 2021. Preprint.


After George Floyd's death in May 2020, the volume of discussion in social media increased dramatically. A series of protests followed this tragic event, called as the 2020 BlackLivesMatter (BLM) movement. People participated in the discussion for several reasons; including protesting and advocacy, as well as spamming and misinformation spread. Eventually, many user accounts are deleted by their owners or suspended due to violating the rules of social media platforms. In this study, we analyze what happened in Twitter before and after the event triggers. We create a novel dataset that includes approximately 500k users sharing 20m tweets, half of whom actively participated in the 2020 BLM discussion, but some of them were deleted or suspended later. We have the following research questions: What differences exist (i) between the users who did participate and not participate in the BLM discussion, and (ii) between old and new users who participated in the discussion? And, (iii) why are users deleted and suspended? To find answers, we conduct several experiments supported by statistical tests; including lexical analysis, spamming, negative language, hate speech, and misinformation spread.
