2017
DOI: 10.48550/arxiv.1703.04009
Preprint
Automated Hate Speech Detection and the Problem of Offensive Language

Cited by 55 publications (106 citation statements) | References 0 publications
“…Hate speech, defined as speech that targets social groups with the intent to cause harm, is arguably the most widely studied form of incivility detection, largely due to the practical need to moderate online discussions. Many Twitter datasets have been collected: of racist and sexist tweets (Waseem and Hovy, 2016), of hateful and offensive tweets (Davidson et al., 2017), and of hateful, abusive, and spam tweets (Founta et al., 2018). Another category of incivility detection that more closely aligns with our work is toxicity prediction.…” (Footnote 2: Available at https://github.com/anushreehede/incivility_in_news)
Section: Datasets For Incivility Detection (supporting)
confidence: 67%
“…We are most interested in responses generated by dialogue models in offensive contexts. However, offensive language is rare in a random sample (Davidson et al., 2017; Founta et al., 2018). Hence, we implement a two-stage sampling strategy: (1) Random sample: from both sources, randomly sample 500 threads (total 1000).…”
Section: Data Collection (mentioning)
confidence: 99%
“…Identifying Toxicity: Most work on identifying toxic language looked at individual social media posts or comments without taking context into account (Davidson et al., 2017; Xu et al., 2012; Zampieri et al., 2019; Rosenthal et al., 2020; Kumar et al., 2018; Garibo i Orts, 2019; Ousidhoum et al., 2019; Breitfeller et al., 2019; Hada et al., 2021; Barikeri et al., 2021).…”
Section: Related Work (mentioning)
confidence: 99%
“…Detecting offensive and abusive content online is a critical step in mitigating the harms it causes to people (Waseem and Hovy, 2016; Davidson et al., 2017). Various online platforms have increasingly turned to NLP techniques to do this task at scale (e.g., the Perspective API).…”
Section: Introduction (mentioning)
confidence: 99%