2022
DOI: 10.1016/j.ipm.2022.102981

Transfer language selection for zero-shot cross-lingual abusive language detection

Cited by 21 publications (11 citation statements) · References 50 publications
“…As a result, many studies employed pre-trained multilingual word embeddings such as FastText (Bigoulaeva, Hangya & Fraser, 2021), MUSE (Pamungkas & Patti, 2019; Deshpande, Farris & Kumar, 2022; Aluru et al., 2020; Bigoulaeva, Hangya & Fraser, 2021), or LASER (Deshpande, Farris & Kumar, 2022; Aluru et al., 2020; Pelicon et al., 2021a; Vitiugin, Senarath & Purohit, 2021). Moreover, most research has focused on pre-trained language models, used essentially as classifiers: BERT (Vashistha & Zubiaga, 2021; zahra El-Alami, Ouatik El Alaoui & En Nahnahi, 2022; Zia et al., 2022; Pamungkas, Basile & Patti, 2021a), AraBERT for Arabic data (zahra El-Alami, Ouatik El Alaoui & En Nahnahi, 2022), CseBERT for English, Croatian and Slovenian data (Pelicon et al., 2021b), as well as multilingual BERT models (Shi et al., 2022; Bhatia et al., 2021; Deshpande, Farris & Kumar, 2022; Aluru et al., 2020; zahra El-Alami, Ouatik El Alaoui & En Nahnahi, 2022; De la Peña Sarracén & Rosso, 2022; Tita & Zubiaga, 2021; Eronen et al., 2022; Ranasinghe & Zampieri, 2021a; Ghadery & Moens, 2020; Pelicon et al., 2021b; Awal et al., 2024; Montariol, Riabi & Seddah, 2022; Ahn et al., 2020a; Bigoulaeva et al., 2022, 2023; Pamungkas, Basile & Patti, 2021a; Pelicon et al., 2021a), the DistilmBERT model (Vitiugin, Senarath & Purohit, 2021), and RoBERTa (Zia et al., 2022).…”
Section: Approaches on Multilingual Hate Speech Detection
confidence: 99%
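
The embedding-based line of work quoted above works by mapping all languages into one shared vector space, so a classifier trained on a source language can be applied to a target language directly. Below is a minimal sketch of that idea, assuming MUSE-style aligned fastText vectors; the file paths and toy data are illustrative assumptions, not the setup of any cited study.

import numpy as np
import fasttext  # pip install fasttext
from sklearn.linear_model import LogisticRegression

# Aligned embeddings place translation pairs close together across languages,
# so a classifier trained on one language transfers to another.
ft_en = fasttext.load_model("wiki.en.align.bin")  # hypothetical local paths
ft_es = fasttext.load_model("wiki.es.align.bin")

def embed(model, text):
    # Represent a sentence as the average of its aligned word vectors.
    return np.mean([model.get_word_vector(w) for w in text.lower().split()], axis=0)

# Toy labelled data in the source language (1 = abusive, 0 = not).
en_texts, en_labels = ["you are pathetic", "thanks for your help"], [1, 0]
clf = LogisticRegression(max_iter=1000).fit(
    np.stack([embed(ft_en, t) for t in en_texts]), en_labels)

# Zero-shot prediction on the target language: no Spanish training data used.
es_texts = ["eres patético", "gracias por tu ayuda"]
predictions = clf.predict(np.stack([embed(ft_es, t) for t in es_texts]))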
“…On the other hand, cross-lingual language models like XLM were also widely employed: we found implementations of XLM-RoBERTa (XLM-R) (Roy et al., 2021a; Bhatia et al., 2021; Wang et al., 2020; De la Peña Sarracén & Rosso, 2022; Zia et al., 2022; Tita & Zubiaga, 2021; Ranasinghe & Zampieri, 2021b; Dadu & Pant, 2020; Eronen et al., 2022; Ranasinghe & Zampieri, 2021a, 2020; Mozafari, Farahbakhsh & Crespi, 2022; Barbieri, Espinosa Anke & Camacho-Collados, 2022; Awal et al., 2024; Stappen, Brunn & Schuller, 2020), and of both XLM-R and XLM-T (Montariol, Riabi & Seddah, 2022; Riabi, Montariol & Seddah, 2022). These approaches have been shown to improve performance on multilingual and cross-lingual hate speech detection tasks because they are better able to capture semantic and syntactic features across languages, thanks to their pre-training on large volumes of multilingual text.…”
Section: Approaches on Multilingual Hate Speech Detection
confidence: 99%
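
The XLM-R pipelines cited above typically fine-tune on one (transfer) language and evaluate directly on another. A minimal sketch of that loop follows; the checkpoint name is the standard HuggingFace one, but the single toy batch and bare training step are illustrative assumptions rather than the pipeline of any cited study.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2)  # abusive vs. not abusive
optim = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Fine-tune on the transfer language (here English; one toy batch).
model.train()
batch = tok(["you are pathetic", "thanks for your help"],
            padding=True, truncation=True, return_tensors="pt")
loss = model(**batch, labels=torch.tensor([1, 0])).loss
loss.backward()
optim.step()
optim.zero_grad()

# Evaluate zero-shot on the target language (here German): no target-language
# labels are ever seen during training.
model.eval()
with torch.no_grad():
    batch = tok(["du bist erbärmlich"], return_tensors="pt")
    pred = model(**batch).logits.argmax(dim=-1)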
“…Cyberbullying detection has gained considerable attention in recent years owing to the widespread use of social media platforms and online communication channels [7]. Researchers have explored various techniques and methodologies to identify and address cyberbullying across languages and cultures [7,8,31–35]. However, while numerous research efforts have introduced solutions for detecting cyberbullying in high-resource languages such as English or Japanese, few studies have extensively addressed cyberbullying detection in low-resource languages such as Bangla.…”
Section: Related Work
confidence: 99%
“…Through exposure to a wide variety of linguistic data, Multilingual BERT can process and represent the subtleties of many languages, including Bangla. Its robust architecture handles a broad range of natural language processing tasks, including cyberbullying detection [35], making it a valuable tool for multilingual text analysis.…”
Section: Multilingual BERT
confidence: 99%
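
The claim in the excerpt above rests on mBERT's shared WordPiece vocabulary covering Bangla script, so the same model and classification head apply without a language-specific tokenizer. A small sketch of that point; the Bangla example sentence is illustrative, and the checkpoint name is the standard HuggingFace one.

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)  # bullying vs. not

# Bangla text splits into known subwords rather than [UNK] pieces.
print(tok.tokenize("আমি বাংলায় লিখছি"))  # "I am writing in Bangla"

# The same classification head used for other languages applies unchanged.
logits = model(**tok("আমি বাংলায় লিখছি", return_tensors="pt")).logits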
“…Cross-Lingual Abusive Language Detection. In recent years, cross-lingual abusive language detection has gained increasing attention in zero-shot (Eronen et al., 2022) and few-shot (Mozafari et al., 2022) transfer settings.…”
Section: Background and Related Work
confidence: 99%