2022
DOI: 10.1016/j.compeleceng.2022.108005
Cross-lingual offensive speech identification with transfer learning for low-resource languages

Cited by 8 publications (3 citation statements) | References 15 publications
“…As a result, many studies employed pre-trained multilingual word embeddings such as FastText (Bigoulaeva, Hangya & Fraser, 2021), MUSE (Pamungkas & Patti, 2019; Deshpande, Farris & Kumar, 2022; Aluru et al., 2020; Bigoulaeva, Hangya & Fraser, 2021), or LASER (Deshpande, Farris & Kumar, 2022; Aluru et al., 2020; Pelicon et al., 2021a; Vitiugin, Senarath & Purohit, 2021). Moreover, most research studies have focused on the use of pre-trained language models, mainly as classifiers: BERT (Vashistha & Zubiaga, 2021; Zahra El-Alami, Ouatik El Alaoui & En Nahnahi, 2022; Zia et al., 2022; Pamungkas, Basile & Patti, 2021a), AraBERT for Arabic data (Zahra El-Alami, Ouatik El Alaoui & En Nahnahi, 2022), CseBERT for English, Croatian and Slovenian data (Pelicon et al., 2021b), multilingual BERT models (Shi et al., 2022; Bhatia et al., 2021; Deshpande, Farris & Kumar, 2022; Aluru et al., 2020; Zahra El-Alami, Ouatik El Alaoui & En Nahnahi, 2022; De la Peña Sarracén & Rosso, 2022; Tita & Zubiaga, 2021; Eronen et al., 2022; Ranasinghe & Zampieri, 2021a; Ghadery & Moens, 2020; Pelicon et al., 2021b; Awal et al., 2024; Montariol, Riabi & Seddah, 2022; Ahn et al., 2020a; Bigoulaeva et al., 2022, 2023; Pamungkas, Basile & Patti, 2021a; Pelicon et al., 2021a), DistilmBERT (Vitiugin, Senarath & Purohit, 2021), and RoBERTa (Zia et al., 2022).…”
Section: Approaches on Multilingual Hate Speech Detection (citation type: mentioning; confidence: 99%)
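The statement above describes the dominant pattern of fine-tuning a pre-trained multilingual language model (e.g. mBERT or XLM-R) as a classifier for offensive speech. Below is a minimal sketch of that pattern using the Hugging Face transformers library; the checkpoint name, example texts, and labels are illustrative assumptions, not taken from the cited works.

```python
# Minimal sketch: fine-tuning a multilingual BERT checkpoint as a binary
# offensive-speech classifier. Checkpoint, texts and labels are illustrative.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-multilingual-cased"  # an XLM-R checkpoint would work the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy cross-lingual setting: train on a high-resource language,
# then apply the same model to text in a low-resource language.
texts = ["You are a wonderful person", "Some offensive sentence here"]
labels = torch.tensor([0, 1])  # 0 = not offensive, 1 = offensive

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)
outputs.loss.backward()  # one gradient step of standard fine-tuning
print(outputs.logits.argmax(dim=-1))
```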
“…This method utilizes a combination of transformers (mBERT and XLM-R) with convolutional neural layers to encode the texts. Moreover, Shi et al. (2022) proposed an unsupervised model that employs cross-lingual mapping, sample generation, and transfer learning. Their model uses a novel training methodology that combines adversarial learning, transfer learning, and agreement regularization to detect offensive language in many low-resource languages.…”
Section: Approaches on Multilingual Hate Speech Detection (citation type: mentioning; confidence: 99%)
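The architecture mentioned above, a multilingual transformer encoder whose token representations are fed to convolutional layers before classification, can be sketched as follows. This is a generic illustration under assumed hyperparameters (kernel widths, channel counts, checkpoint name), not the exact model of the cited paper.

```python
# Sketch: multilingual transformer (mBERT / XLM-R) + 1-D convolutional head.
# Hyperparameters and checkpoint are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel

class TransformerCNNClassifier(nn.Module):
    def __init__(self, encoder_name="xlm-roberta-base", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Parallel convolutions over the token dimension with several kernel widths.
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, 128, kernel_size=k, padding=k // 2) for k in (3, 5, 7)
        )
        self.classifier = nn.Linear(128 * 3, num_labels)

    def forward(self, input_ids, attention_mask):
        tokens = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state                              # (batch, seq_len, hidden)
        x = tokens.transpose(1, 2)                       # (batch, hidden, seq_len) for Conv1d
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.classifier(torch.cat(pooled, dim=1))
```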
“…For example, data augmentation has frequently been used to expand training data and enhance the generalization ability of models in diverse linguistic contexts [55,56]. To further bridge the gap between languages, strategies such as adversarial training [57,58] and contrastive learning [54] have been used to improve the model's cross-lingual capability. Among them, adversarial training introduces adversarial perturbations to enhance the robustness of the model, while contrastive learning uses a contrastive loss to effectively discern the similarities and disparities across languages.…”
Section: Cross-lingual Tasks (citation type: mentioning; confidence: 99%)
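As a concrete illustration of the contrastive-learning idea described above, the sketch below computes an InfoNCE-style loss that pulls embeddings of parallel sentences from two languages together while treating other pairs in the batch as negatives. The encoder is abstracted away and the temperature value is an assumption; this is not the loss used by any specific cited work.

```python
# Sketch: cross-lingual contrastive (InfoNCE-style) loss over paired sentence
# embeddings from a source and a target language. Temperature is illustrative.
import torch
import torch.nn.functional as F

def cross_lingual_contrastive_loss(src_emb, tgt_emb, temperature=0.07):
    """src_emb, tgt_emb: (batch, dim) embeddings of parallel sentences."""
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.t() / temperature          # cosine similarities of all pairs
    targets = torch.arange(src.size(0), device=src.device)
    # Each source sentence should match its own translation (the diagonal).
    return F.cross_entropy(logits, targets)

# Toy usage with random tensors standing in for encoder outputs.
loss = cross_lingual_contrastive_loss(torch.randn(8, 768), torch.randn(8, 768))
print(loss.item())
```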