Gendered disinformation undermines women’s rights, democratic principles, and national security, deepening societal divisions as authoritarian regimes deliberately weaponize social media. Online misogyny is a harmful societal problem that threatens to turn digital platforms into environments hostile and inhospitable to women. Despite the severity of the issue, efforts to persuade digital platforms to strengthen their protections against gendered disinformation are frequently ignored, underscoring how difficult it is to counter online misogyny in the face of commercial interests. This growing concern highlights the need for effective measures to create safer online spaces, where respect and equality prevail and women can participate fully and freely without fear of harassment or discrimination. This study addresses the challenge of detecting misogynous content in bilingual (English and Italian) online communications. Using FastText word embeddings and explainable artificial intelligence techniques, we introduce a model that improves both the interpretability and the accuracy of misogyny detection. For an in-depth analysis, we ran experiments ranging from classic machine learning methods and conventional deep learning approaches to recent transformer-based models with both language-specific and multilingual capabilities. The paper advances misogyny-detection methodology by applying incremental learning to recent benchmark datasets of tweets and posts drawn from sources such as Facebook, Twitter, and Reddit, with the proposed approach outperforming existing methods on these datasets in accuracy, F1-score, precision, and recall. This process involved refining hyperparameters, employing optimization techniques, and tuning generative configurations. Finally, by applying Local Interpretable Model-agnostic Explanations (LIME), we elucidate the rationale behind the model’s predictions, making its decision-making process easier to understand.
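As a rough illustration of the pipeline this abstract describes, the sketch below pairs averaged FastText sentence vectors with a simple linear classifier and then asks LIME to attribute a prediction to individual tokens. The pretrained vector file (cc.en.300.bin), the toy training pairs, and the logistic-regression head are illustrative assumptions, not the paper's actual architecture or data.

```python
# Minimal sketch: FastText sentence embeddings + a linear classifier,
# explained with LIME. Requires the pretrained vector file cc.en.300.bin
# (https://fasttext.cc) and the fasttext, scikit-learn, and lime packages.
import numpy as np
import fasttext
from sklearn.linear_model import LogisticRegression
from lime.lime_text import LimeTextExplainer

ft = fasttext.load_model("cc.en.300.bin")  # pretrained English vectors (assumption)

def embed(texts):
    # fastText averages word/subword vectors into one fixed-size sentence vector
    return np.vstack([ft.get_sentence_vector(t.replace("\n", " ")) for t in texts])

# Toy stand-ins for the bilingual training corpus (1 = misogynous, 0 = not).
train_texts = ["women do not belong in science", "great result, congratulations!"]
train_labels = [1, 0]

clf = LogisticRegression(max_iter=1000).fit(embed(train_texts), train_labels)

def predict_proba(texts):
    # LIME perturbs the input text and needs class probabilities back
    return clf.predict_proba(embed(texts))

explainer = LimeTextExplainer(class_names=["non-misogynous", "misogynous"])
exp = explainer.explain_instance("women do not belong in science",
                                 predict_proba, num_features=5)
print(exp.as_list())  # tokens with signed contributions to the prediction
```

In the paper's setting, the toy classifier and corpus would be replaced by the actual bilingual model and datasets; the LIME call stays the same because it only needs a text-in, probabilities-out function.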
With the rapid increase of social media users, cyberbullying and hate speech have become growing problems in recent years. Automatic hate speech detection (HSD) from text is an emerging research problem in natural language processing (NLP). Researchers have developed various approaches to automatic hate speech detection using corpora in many languages; however, research on Urdu is scarce. This study addresses the HSD task on Twitter using Roman Urdu text. The contribution of this research is a hybrid model for Roman Urdu HSD, which has not been previously explored. The novel hybrid model integrates deep learning (DL) and transformer models for automatic feature extraction with machine learning algorithms (MLAs) for classification. To further enhance model performance, we employ several hyperparameter optimization (HPO) techniques: Grid Search (GS), Randomized Search (RS), and Bayesian Optimization with Gaussian Processes (BOGP). Evaluation is carried out on two publicly available benchmark Roman Urdu corpora, the HS-RU-20 corpus and the RUHSOLD hate speech corpus. Results demonstrate that the Multilingual BERT (MBERT) feature learner, paired with a Support Vector Machine (SVM) classifier and optimized using RS, achieves state-of-the-art performance. On the HS-RU-20 corpus, this model attained an accuracy of 0.93 and an F1 score of 0.95 on the Neutral-Hostile classification task, and an accuracy of 0.89 with an F1 score of 0.88 on the Hate Speech-Offensive task. On the RUHSOLD corpus, the same model achieved an accuracy of 0.95 and an F1 score of 0.94 on the coarse-grained task, alongside an accuracy of 0.87 and an F1 score of 0.84 on the fine-grained task. These results demonstrate the effectiveness of our hybrid approach to Roman Urdu hate speech detection.
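The abstract's best configuration (MBERT features, an SVM head, Randomized Search) can be sketched as follows. The checkpoint name, mean-pooling strategy, placeholder corpus, and search space are assumptions for illustration, not the paper's exact setup.

```python
# Hedged sketch of the hybrid pipeline: frozen multilingual BERT as a
# feature extractor, an SVM on top, and Randomized Search over the SVM
# hyperparameters. Requires torch, transformers, and scikit-learn.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased").eval()

@torch.no_grad()
def mbert_features(texts, batch_size=32):
    feats = []
    for i in range(0, len(texts), batch_size):
        enc = tok(texts[i:i + batch_size], padding=True, truncation=True,
                  max_length=128, return_tensors="pt")
        hidden = bert(**enc).last_hidden_state
        mask = enc["attention_mask"].unsqueeze(-1)                     # ignore padding
        feats.append(((hidden * mask).sum(1) / mask.sum(1)).numpy())   # mean pooling
    return np.vstack(feats)

# Placeholder rows standing in for HS-RU-20 / RUHSOLD examples.
texts = ["roman urdu sample %d" % i for i in range(10)]
labels = [i % 2 for i in range(10)]  # 1 = hostile/hate, 0 = neutral

search = RandomizedSearchCV(
    SVC(),
    {"C": np.logspace(-2, 2, 20).tolist(),
     "kernel": ["linear", "rbf"],
     "gamma": ["scale", "auto"]},
    n_iter=15, cv=5, scoring="f1", random_state=0)
search.fit(mbert_features(texts), labels)
print(search.best_params_, search.best_score_)
```

Grid Search or BOGP would slot into the same place as RandomizedSearchCV; only the search strategy changes, not the MBERT-plus-SVM pipeline itself.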