2020
DOI: 10.1007/978-3-030-61255-9_20
Detecting Online Hate Speech: Approaches Using Weak Supervision and Network Embedding Models

Cited by 12 publications (6 citation statements). References 20 publications.
“…With the development of large pre-trained transformer models such as BERT and XLNET (Devlin et al, 2019; Yang et al, 2019), several studies have explored the use of general pre-trained transformers in offensive language identification (Liu et al, 2019; Bucur et al, 2021) as well as retrained or fine-tuned models on offensive language corpora such as HateBERT (Caselli et al, 2020). While the vast majority of studies address offensive language identification using English data (Yao et al, 2019; Ridenhour et al, 2020), several recent studies have created new datasets for various languages and applied computational models to identify such content in Arabic (Mubarak et al, 2021), Dutch (Tulkens et al, 2016), French (Chiril et al, 2019), German (Wiegand et al, 2018), Greek (Pitenis et al, 2020), Hindi (Bohra et al, 2018), Italian (Poletto et al, 2017), Portuguese (Fortuna et al, 2019), Slovene (Fišer et al, 2017), Spanish (Plaza-del Arco et al, 2021), and Turkish (Çöltekin, 2020). A recent trend is the use of pre-trained multilingual models such as XLM-R (Conneau et al, 2019) to leverage available English resources to make predictions in languages with fewer resources (Plaza-del Arco et al, 2021; Zampieri, 2020, 2021c,b; Sai and Sharma, 2021).…”
Section: Related Work
confidence: 99%
“…The clear majority of studies on this topic deal with English (Malmasi and Zampieri, 2017; Yao et al, 2019; Ridenhour et al, 2020), partly motivated by the availability of English resources (e.g., corpora, lexicons, and pre-trained models).…”
Section: Related Work
confidence: 99%
“…2017; Yao et al, 2019; Ridenhour et al, 2020; Rosenthal et al, 2020) due to the wide availability of language resources such as corpora and pre-trained models. In recent years, several studies have been published on identifying offensive content in other languages such as Arabic (Mubarak et al, 2020), Dutch (Tulkens et al, 2016), French (Chiril et al, 2019), Greek (Pitenis et al, 2020), Italian (Poletto et al, 2017), Portuguese (Fortuna et al, 2019), and Turkish (Çöltekin, 2020).…”
Section: Post
confidence: 99%