2021
DOI: 10.48550/arxiv.2102.08886
Preprint

Towards generalisable hate speech detection: a review on obstacles and solutions

Abstract: Hate speech is one type of harmful online content which directly attacks or promotes hate towards a group or an individual member based on their actual or perceived aspects of identity, such as ethnicity, religion, and sexual orientation. With online hate speech on the rise, its automatic detection as a natural language processing task is gaining increasing interest. However, it is only recently that it has been shown that existing models generalise poorly to unseen data. This survey paper attempts to summaris…

Cited by 6 publications (4 citation statements)
References 85 publications (148 reference statements)
“…If the goal is to create a model that performs as well as possible on one dataset, then the traditional approach is appropriate. On the other hand, if we want to create a model that will generalise across time and topic, we believe it would be sensible for researchers to introduce domain specific knowledge and to also use an alternative test-set, as has been done in other fields (Yin and Zubiaga, 2021). Whilst machine learning can deliver impressive results, there is value in understanding relevant theory, as shown in our results.…”
Section: Discussion
Confidence: 78%
“…Debiasing Hate Speech Detection: Recent works (Yin and Zubiaga, 2021; Wiegand et al., 2019; Kennedy et al., 2020; Ma et al., 2020; Gehman et al., 2020; Dreier et al., 2022; Stanovsky et al., 2019; Thakur et al., 2023; Ziems et al., 2023) have studied the generalisability of, and biases in, hate speech detection (Talat et al., 2018; AlKhamissi et al., 2022; Röttger et al., 2022; Bianchi et al., 2022). For instance, prior work found that existing hate speech detection models are biased against African American Vernacular English speakers (Harris et al., 2022b; Sap et al., 2019) and that certain identity words are highly correlated with hateful labels (Bender et al., 2021; ElSherief et al., 2021a).…”
Section: Related Work
Confidence: 99%
“…In the context of automated detection studies, offensive and abusive language are both used as overarching terms for harmful content. Offensive language has a broader reach, and hope speech falls under each of these categories (Hande et al., 2021b; Yin and Zubiaga, 2021). The strong relationship between hate speech and actual hate crimes highlights the significance of identifying and moderating hate speech.…”
Section: Related Work
Confidence: 99%