Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)
DOI: 10.18653/v1/2021.woah-1.3

HateBERT: Retraining BERT for Abusive Language Detection in English

Abstract: We introduce HateBERT, a re-trained BERT model for abusive language detection in English. The model was trained on RAL-E, a large-scale dataset of Reddit comments in English from communities banned for being offensive, abusive, or hateful that we have curated and made available to the public. We present the results of a detailed comparison between a general pre-trained language model and the retrained version on three English datasets for offensive, abusive language and hate speech detection tasks. In all data…
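As a rough illustration of the intended use, the sketch below fine-tunes the released checkpoint for binary abusive-language classification with Hugging Face transformers. The checkpoint name GroNLP/hateBERT is taken from a citation statement further down this page; the example texts, labels, and hyperparameters are illustrative placeholders, not the paper's actual experimental setup.

```python
# Minimal sketch: fine-tuning HateBERT for binary abusive-language
# detection. "GroNLP/hateBERT" appears in a citation statement below;
# the texts, labels, and learning rate are placeholder assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("GroNLP/hateBERT")
model = AutoModelForSequenceClassification.from_pretrained(
    "GroNLP/hateBERT", num_labels=2  # 0 = not abusive, 1 = abusive
)

texts = ["have a great day", "example of an abusive comment"]
labels = torch.tensor([0, 1])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
loss = model(**batch, labels=labels).loss

# One optimisation step; a real run would iterate over one of the
# labelled offensive/abusive/hate speech datasets from the abstract.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss.backward()
optimizer.step()
```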


Cited by 152 publications (112 citation statements)
References 31 publications
“…Our results are favorable, ranging from 0.68-0.80 in macro-F1. Our results also outperform a variant of BERT that has been pretrained using hateful texts (Caselli et al., 2020): we achieved 0.61 in hate-F1 using the generic BERT finetuned on the original SE dataset vs. their 0.65, and improved this result to 0.77 with data augmentation. Compared with the CNN-GRU results (Zhang et al., 2018) reported in Wullach et al. (2021), we obtain better results both prior and post augmentation in most cases.…”
Section: Cross-dataset Results
confidence: 73%
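The quoted statement mixes two metrics: macro-F1 averages the per-class F1 scores, while hate-F1 is the F1 of the hateful class alone. A minimal sketch of the distinction, using invented labels and predictions:

```python
# Hypothetical labels/predictions purely to show how macro-F1 and
# hate-F1 (per-class F1 for the hateful class) are computed.
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = hateful, 0 = not hateful
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

macro_f1 = f1_score(y_true, y_pred, average="macro")  # mean of both classes' F1
hate_f1 = f1_score(y_true, y_pred, average="binary", pos_label=1)

print(f"macro-F1 = {macro_f1:.2f}, hate-F1 = {hate_f1:.2f}")
```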
“…We show that training on labelled in-domain data leads to better performance than similarly sized out-of-domain datasets, confirming the differences between the domains and highlighting the need for conversational data. While performance using general-domain pretrained models leaves room for improvement, in future work, we hope to experiment with different initialisation settings, using models trained on data and tasks more similar to those of ConvAbuse, such as HateBERT (Caselli et al., 2021) or HurtBERT (Koufakou et al., 2020).…”
Section: Discussion
confidence: 99%
“…For all BERT models, we use the pre-trained bert-large-cased, for BART we use bart-large. We also report experiments where we use a fine-tuned toxicity version of pre-trained BERT (Caselli et al., 2021, GroNLP/hateBERT) for f_cnd in Dropout BERT (here referred to as Hate BERT, and when using a finetuned substitute classifier: Hate BERT+). The idea here is similar to that of using Dropout BERT+; domain-specific vocabularies will likely result in better and more varied substitutions.…”
Section: Augmentation Models
confidence: 99%
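Since hateBERT is a masked language model, the substitution idea described above can be sketched with a fill-mask query: mask a token and let the domain-adapted model propose in-domain replacements. This shows only the substitution step, not the cited paper's full Dropout BERT procedure, and the example sentence is invented.

```python
# Minimal sketch of MLM-based word substitution, the rough idea behind
# plugging hateBERT into an augmentation pipeline: the domain-adapted
# model should propose more varied, in-domain substitutes.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="GroNLP/hateBERT")

# Invented example; [MASK] is BERT's mask token.
for candidate in fill_mask("That comment was really [MASK].", top_k=5):
    print(candidate["token_str"], round(candidate["score"], 3))
```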
“…Our work combines multiple sizeable bodies of work, to the extent that they have respectively produced several surveys (Fortuna and Nunes, 2018; Gunasekara and Nejadgholi, 2018; Mishra et al., 2019; Banko et al., 2020; Madukwe et al., 2020; Muneer and Fati, 2020; Salawu et al., 2020; Jahan and Oussalah, 2021; Mladenovic et al., 2021). Recent cyberbullying work (e.g., the seminal studies of Reynolds et al., 2011; Xu et al., 2012; Nitta et al., 2013; Bretschneider et al., 2014; Dadvar et al., 2014; Van Hee et al., 2015) has primarily focused on deploying Transformer-based models (Vaswani et al., 2017), by and large fine-tuning (e.g., Swamy et al., 2019; Paul and Saha, 2020; Gencoglu, 2021) or re-training (Caselli et al., 2020) BERT. It is worth noting that Elsafoury et al. (2021a; 2021b) show that although fine-tuning BERT achieves state-of-the-art performance in classification, its attention scores do not correlate with cyberbullying features, and they expect generalization of such models to be subpar.…”
Section: Related Work
confidence: 99%