Hate speech is a growing problem on social media due to the increasing volume of content being shared. Recent work has demonstrated the usefulness of machine learning algorithms combined with natural language processing techniques to detect hateful content. However, when not constructed with the necessary care, learning models can amplify discriminatory behaviour, incorrectly associating comments containing specific identity terms (e.g., woman, black, and gay) with a particular class, such as hate speech. Moreover, the test set used to evaluate bias requires specific characteristics, since a test set that follows the same biased distribution as the training set can compromise the results reported by bias metrics. This work argues that potential bias in hate speech detection must be taken into account and focuses on developing an intelligent system to address these limitations. First, we propose a comprehensive, unbiased dataset for evaluating unintended gender bias. Second, we propose a framework to help analyse the bias introduced by feature extraction techniques. Third, we evaluate several state-of-the-art feature extraction techniques, focusing specifically on bias towards identity terms. We consider six feature extraction techniques (TF, TF-IDF, FastText, GloVe, BERT, and RoBERTa) and six classifiers (LR, DT, SVM, XGB, MLP, and RF). The experimental study across hate speech datasets, using a range of classification and unintended-bias metrics, demonstrates that the choice of feature extraction technique can affect the bias of predictions and that its effectiveness can depend on the dataset analysed. For instance, combining TF and TF-IDF with DT and MLP resulted in higher bias, while BERT and RoBERTa showed lower bias with the same classifiers for the HE and WH datasets. The proposed dataset and source code will be made publicly available upon publication.
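To make the evaluated pipeline concrete, the sketch below shows one of the feature-extraction/classifier combinations mentioned above (TF-IDF with LR) together with a simple identity-term probe. This is a minimal illustration, not the paper's implementation: it assumes scikit-learn, the training sentences and the probe templates are hypothetical, and the actual datasets (HE, WH) and unintended-bias metrics used in the study are not reproduced here.

```python
# Minimal sketch (assumed scikit-learn API): TF-IDF features fed to a
# logistic regression classifier, plus a toy identity-term probe.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy training data: 1 = hateful, 0 = non-hateful.
texts = [
    "I hate all of them, they should disappear",
    "you people are disgusting and worthless",
    "had a lovely afternoon at the park",
    "the new album is fantastic, highly recommend it",
]
labels = [1, 1, 0, 0]

# One of the 36 feature-extraction/classifier combinations: TF-IDF + LR.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)

# Simple probe for unintended identity-term bias: templates that differ only
# in the identity term should receive similar hate-speech probabilities.
templates = ["I am a {} person".format(term) for term in ("woman", "black", "gay")]
for sentence, prob in zip(templates, model.predict_proba(templates)[:, 1]):
    print(f"{sentence!r}: P(hate) = {prob:.3f}")
```

In practice, the same structure would be repeated for each feature extraction technique and classifier pair, with the probe replaced by the unbiased evaluation dataset and the bias metrics described in the paper.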