A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media

Mozafari, Marzieh; Farahbakhsh, Reza; Crespi, Noël

doi:10.1007/978-3-030-36687-2_77

Cited by 264 publications

(216 citation statements)

References 21 publications

Supporting

Mentioning

208

Contrasting

Unclassified


“…To define automated methods with a promising performance for hate speech detection in social media, Natural Language Processing (NLP) has been used jointly with classic Machine Learning (ML) [2][3][4] and Deep Learning (DL) techniques [6,15,16]. The majority of contributions in classic supervised machine learning-based methods, for hate speech detection, rely on different text mining-based features or user-based and platform-based metadata [4,17,18], which require them to define an applicable feature extraction method and prevent them to generalize their approach to new datasets and platforms.…”

Section: Disclaimer (mentioning)

confidence: 99%

“…Although some deep neural network models such as Convolutional Neural Networks (CNNs) [16], Long Short-Term Memory Networks (LSTMs) [6], etc., have been employed to enhance the performance of hate speech detection tools, the requirement of a sufficient amount of labeled data and the inability of methods to be generalized have remained as open challenges. To address these limitations some transfer learning methods are proposed recently [15,19]. However these methods enhanced the performance of hate speech detection models, they did not address the existing bias in data and algorithm.…”

Section: Disclaimer (mentioning)

confidence: 99%

“…This study is an extended version of our previous work [15] at which we proposed a transfer learning approach for identification of hate speech in online social media by employing a combination of the unsupervised pre-trained model BERT [23] and new supervised fine-tuning strategies. Here, we investigate the effect of unintended bias in our pre-trained BERT-based model and use a generalization mechanism proposed by Schuster et.al [46], for debiasing fact verification models in training data by reweighting samples and then changing the fine-tuning strategies in terms of the loss function to mitigate the racial bias propagated through the model.…”

Section: Disclaimer (mentioning)

confidence: 99%

“…• Following our previous study [15], we conduct a comprehensive experiment to inspect the impact of our transfer learning approach in a shortage of labeled data and in capturing syntactical and contextual information of all BERT transformers' embeddings.…”

Section: Disclaimer (mentioning)

confidence: 99%


Hate speech detection and racial bias mitigation in social media based on BERT model

2020

Self Cite


Disparate biases associated with datasets and trained classifiers in hateful and abusive content identification tasks have raised many concerns recently. Although the problem of biased datasets on abusive language detection has been addressed more frequently, biases arising from trained classifiers have not yet been a matter of concern. In this paper, we first introduce a transfer learning approach for hate speech detection based on an existing pretrained language model called BERT (Bidirectional Encoder Representations from Transformers) and evaluate the proposed model on two publicly available datasets that have been annotated for racism, sexism, hate or offensive content on Twitter. Next, we introduce a bias alleviation mechanism to mitigate the effect of bias in the training set during the fine-tuning of our pre-trained BERT-based model for hate speech detection. Toward that end, we use an existing regularization method to reweight input samples, thereby decreasing the effect of training-set n-grams that are highly correlated with class labels, and then fine-tune our pre-trained BERT-based model with the new re-weighted samples. To evaluate our bias alleviation mechanism, we employed a cross-domain approach in which we use the trained classifiers on the aforementioned datasets to predict the labels of two new datasets from Twitter, AAE-aligned and White-aligned groups, which indicate tweets written in African-American English (AAE) and Standard American English (SAE), respectively. The results show the existence of systematic racial bias in trained classifiers, as they tend to assign tweets written in AAE from the AAE-aligned group to negative classes such as racism, sexism, hate, and offensive more often than tweets written in SAE from the White-aligned group. However, the racial bias in our classifiers reduces significantly after our bias alleviation mechanism is incorporated. This work could institute the first step towards debiasing hate speech and abusive language detection systems.
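The abstract above describes reweighting training samples before fine-tuning. As a rough illustration only, the sketch below shows one common way per-sample weights can enter a classification loss: a weighted mean of per-sample cross-entropy. The function names and the example weights are hypothetical, and the abstract does not state how the authors compute the weights or modify the loss, so this is not their implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(batch_logits, labels, sample_weights):
    """Weighted mean of per-sample cross-entropy.

    sample_weights are assumed to come from a separate reweighting step
    (hypothetical here); a smaller weight reduces that sample's influence.
    """
    losses = [
        -math.log(softmax(logits)[label])
        for logits, label in zip(batch_logits, labels)
    ]
    total_weight = sum(sample_weights)
    return sum(w * l for w, l in zip(sample_weights, losses)) / total_weight


# Two samples, both labelled class 0; the second is misclassified.
logits = [[2.0, 0.0], [0.0, 2.0]]
labels = [0, 0]
uniform = weighted_cross_entropy(logits, labels, [1.0, 1.0])
downweighted = weighted_cross_entropy(logits, labels, [1.0, 0.5])
```

With uniform weights this reduces to the ordinary mean cross-entropy; lowering the weight of the second sample shrinks its contribution to the loss.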


“…The deep learning methods can be roughly divided into two categories: one focuses on front-end processing which optimizes the word embedding technology, and the other on mid-end processing which usually uses simple word or character based embedding technology and pays more attention to the middle neural networks processing. The most famous methods focused on front-end processing are Embeddings from Language Models (ELMo) [6] [13], which trains word vectors with context, and Bidirectional Encoder Representation from Transformers (BERT) [14] [15]. BERT is the first deeply bidirectional, unsupervised language representation from unlabeled text by jointly conditioning on both left and right context in all layers.…”

Section: B State Of the Art In Deep Learning (mentioning)

confidence: 99%

Deep Learning Based Fusion Approach for Hate Speech Detection

et al. 2020


In recent years, the increasing prevalence of hate speech in social media has been considered as a serious problem worldwide. Many governments and organizations have made significant investment in hate speech detection techniques, which have also attracted the attention of the scientific community. Although plenty of literature focusing on this issue is available, it remains difficult to assess the performances of each proposed method, as each has its own advantages and disadvantages. A general way to improve the overall results of classification by fusing the various classifiers results is a meaningful attempt. We first focus on several famous machine learning methods for text classification such as Embeddings from Language Models (ELMo), Bidirectional Encoder Representation from Transformers (BERT) and Convolutional Neural Network (CNN), and apply these methods to the data sets of the SemEval 2019 Task 5. We then adopt some fusion strategies to combine the classifiers to improve the overall classification performance. The results show that the accuracy and F1-score of the classification are significantly improved.
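The abstract above mentions fusing the outputs of several classifiers but does not name the fusion strategies used. As an illustration of the general idea, the sketch below shows two widely used strategies, soft voting (averaging class probabilities) and majority voting. Choosing these two is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def soft_vote(prob_vectors, weights=None):
    """Fuse per-classifier probability vectors for one sample by weighted averaging.

    Returns the winning class index and the fused probability vector.
    """
    if weights is None:
        weights = [1.0] * len(prob_vectors)
    total = sum(weights)
    n_classes = len(prob_vectors[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, prob_vectors)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: fused[c])
    return label, fused


def majority_vote(labels):
    """Return the most frequent predicted label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


# Three hypothetical classifiers scoring one post as [not hateful, hateful].
predictions = [[0.6, 0.4], [0.4, 0.6], [0.1, 0.9]]
fused_label, fused_probs = soft_vote(predictions)
voted_label = majority_vote([0, 1, 1])
```

Soft voting uses each classifier's confidence, while majority voting only needs hard labels; the optional weights let a more reliable classifier count for more.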


Hate speech detection in social media: Techniques, recent trends, and future challenges

Rawat, Kumar, Samant

2024

WIREs Computational Stats


The realm of Natural Language Processing and Text Mining has seen a surge in interest from researchers in hate speech detection, leading to an increase in related studies. This analysis aims to create a valuable resource by summarizing the methods and strategies used to combat hate speech in social media. We perform a detailed review to achieve a deep knowledge of the hate speech detection landscape from 2018 to 2023, revealing global incidents of hate speech in 2022–2023. Sixty‐six relevant articles were selected for this review. Existing studies were analyzed and categorized into five method categories: Machine Learning, Deep Learning, Ensemble models, Graph Neural Networks, and Graph Convolutional Networks. These advancements can aid social networking services in identifying hate messages before being posted, reducing the risk of harassment. The review also covers available hate speech datasets and highlights research challenges, but it is clear that a definitive solution to this problem is yet to be found. Future research directions are recommended to address the ongoing challenges in Hate Speech Detection.

This article is categorized under:
Applications of Computational Statistics > Computational Linguistics
Statistical Learning and Exploratory Methods of the Data Sciences > Knowledge Discovery
Statistical Learning and Exploratory Methods of the Data Sciences > Classification and Regression Trees (CART)
Statistical Learning and Exploratory Methods of the Data Sciences > Text Mining
