Proceedings of the 2nd Workshop on Abusive Language Online (ALW2) 2018
DOI: 10.18653/v1/w18-5113
Comparative Studies of Detecting Abusive Language on Twitter

Abstract: The context-dependent nature of online aggression makes annotating large collections of data extremely difficult. Previously studied datasets in abusive language detection have been insufficient in size to efficiently train deep learning models. Recently, Hate and Abusive Speech on Twitter, a dataset much greater in size and reliability, has been released. However, this dataset has not been comprehensively studied to its potential. In this paper, we conduct the first comparative study of various learning model…

Cited by 60 publications (52 citation statements)
References 21 publications
“…Many definitions pre-empt the effects of abusive language. For instance, Lee et al. describe abusive language as 'any type of insult, vulgarity, or profanity that debases the target; it also can be anything that causes aggravation' (Lee, Yoon, & Jung, 2018). Similarly, in Wulczyn et al.'s dataset of over 100,000 Wikipedia comments, 'toxicity' is defined in relation to how likely it is to make individuals leave a discussion (Wulczyn et al., 2017).…”
Section: Categorizing Abusive Content (mentioning)
confidence: 99%
“…This can lead to considerable degradation in the quality of datasets over time. For instance, Founta et al. shared a dataset of 80,000 tweets, but soon afterwards it was reduced to 70,000 (Founta et al., 2018; Lee et al., 2018). This not only decreases the quantity of data, reducing variety, but also changes the class distribution.…”
Section: Creating and Sharing Datasets (mentioning)
confidence: 99%
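The shrinkage this statement describes follows from how Twitter corpora are shared: as (tweet ID, label) pairs that each researcher must re-download ("rehydrate"), so any tweet deleted in the meantime silently drops out, and classes are not deleted at equal rates. A minimal Python sketch of the effect; all IDs, labels, and counts are illustrative, not taken from the cited datasets:

from collections import Counter

def class_distribution(labels):
    """Return each label's share of the dataset as a fraction."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Original shared corpus: (tweet_id, label) pairs. Purely illustrative.
original = [(1, "abusive"), (2, "normal"), (3, "hateful"),
            (4, "normal"), (5, "abusive"), (6, "normal")]

# IDs that still resolve at rehydration time; the rest were deleted.
still_available = {1, 2, 4, 6}

rehydrated = [(tid, label) for tid, label in original if tid in still_available]

print(class_distribution([label for _, label in original]))    # before
print(class_distribution([label for _, label in rehydrated]))  # after

Because abusive and hateful tweets tend to be removed disproportionately often, the rehydrated corpus is not just smaller but differently balanced, which is exactly the class-distribution shift the quotation warns about.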
“…Nevertheless, it remained not updated at the moment this paper was submitted. Another work was also unable to replicate these results (Lee et al., 2018). After circumventing some of the aforementioned problems in the original code, we describe our specific version and used it to enter the shared task.…”
Section: Results (mentioning)
confidence: 99%
“…In addition to the learning algorithms of article [5] described in Section 2, we also implemented an MLP neural network and a traditional bagging-classifier model. We chose the neural network because of its good results in related articles.…”
Section: Implementing Additional Learning Algorithms (unclassified)
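For concreteness, here is a minimal scikit-learn sketch of the two extra models the citing work names, an MLP and a bagging classifier, built over TF-IDF features. The hyperparameters, toy tweets, and labels are illustrative assumptions, not the configuration reported in the citing paper:

from sklearn.ensemble import BaggingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy training data; real work would use a labeled tweet corpus.
tweets = ["you are wonderful", "I hate you, idiot",
          "have a nice day", "get lost, loser"]
labels = ["normal", "abusive", "normal", "abusive"]

# MLP over TF-IDF features (hidden size and iteration cap are assumptions).
mlp = make_pipeline(TfidfVectorizer(),
                    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500))

# Bagging ensemble of decision trees over the same features.
bagging = make_pipeline(TfidfVectorizer(),
                        BaggingClassifier(DecisionTreeClassifier(),
                                          n_estimators=10))

for name, model in [("MLP", mlp), ("Bagging", bagging)]:
    model.fit(tweets, labels)
    print(name, model.predict(["you idiot"]))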
“…Likewise, differences arose in the results of the same tested learning methods as in article [5] because we could not retrieve all the tweets from the database, since Twitter's operators had in the meantime already deleted some of the malicious and hateful tweets.…”
Section: Conclusion (unclassified)