Recently, Transformer-based architectures (Mozafari et al., 2019; Aluru et al., 2020; Samghabadi et al., 2020; Salminen et al., 2020; Qian et al., 2021; Kennedy et al., 2020; Arviv et al., 2021) have achieved significant improvements over RNN and CNN models (Zhang et al., 2016; Gambäck and Sikdar, 2017; Del Vigna et al., 2017; Park and Fung, 2017). To mitigate the need for extensive annotation, some works use transformers to generate additional training samples (e.g., Vidgen et al., 2020b; Wullach et al., 2020, 2021). Zhou et al. (2021) integrate features from external resources to improve model performance.