2020
DOI: 10.1007/s10579-020-09509-1

Current limitations in cyberbullying detection: On evaluation criteria, reproducibility, and data scarcity

Abstract: The detection of online cyberbullying has seen an increase in societal importance, popularity in research, and available open data. Nevertheless, while computational power and affordability of resources continue to increase, the access restrictions on high-quality data limit the applicability of state-of-the-art techniques. Consequently, much of the recent research uses small, heterogeneous datasets, without a thorough evaluation of applicability. In this paper, we further illustrate these issues, as we (i) ev…

Cited by 31 publications (39 citation statements)
References 52 publications
“…A handful of studies are contributed by scholars to develop resources and cyberbullying detection strategies in different languages worldwide. Most studies have hateful instances ranging from 2 to 5% (Emmery et al, 2020).…”
Section: Related Work
Type: mentioning
confidence: 99%
“…One of the major challenges is posed by the scarcity of the required resources typically for newly emerged languages. Moreover, most of the datasets used for cyberbullying detection, even in mature languages, exhibit an extreme skew between hate speech and non-hate speech textual contents (Emmery et al, 2020). This leads to formation of inappropriate strategies, unreliable predictive performance (specifically for the minority class) and more sensitivity towards classification errors.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Although, methods and tools continue to enhanced in cyberbullying detection, the access restrictions on high-quality data limit the applicability of state-of-the-art techniques. Consequently, much of the recent research uses small, heterogeneous datasets, without a thorough evaluation of applicability [13].…”
Section: Cyberbullying Prevention Methods and Limitations
Type: mentioning
confidence: 99%
“…They experimented with four neural networks on three cyberbullying datasets from different social network platforms. However, two noteworthy studies [30,31] discussed the limitations of the above two works [28,29] in data processing. In these two works, the oversampling method was handled for data processing, which led to overfitting of data, in other words, performance claims of the models in these two works had become overestimated.…”
Section: Cyberbullying Detection
Type: mentioning
confidence: 99%