2019
DOI: 10.18517/ijaseit.9.4.8123

Translated vs Non-Translated Method for Multilingual Hate Speech Identification in Twitter

Abstract: Nowadays social media is often misused to spread hate speech. Spreading hate speech is an act that needs to be handled in a special way because it can undermine or discriminate against other people and cause conflict that leads to both material and immaterial losses. There are several challenges in building a hate speech identification system; one of them is identifying hate speech in a multilingual scope. In this paper, we adapt and compare two methods in multilingual text classification, which are translated (with an…

Cited by 16 publications (14 citation statements)
References 17 publications
“…Based on Table 4, most studies implemented transformer-based architecture to deal with abusive language detection in a cross-lingual setting. However, we also observe some works that exploited a traditional machine learning approach, such as logistic regression [6,10,135], linear support vector machines [92,94], and support vector machines [59]. They used multilingual language representation or simple translation tools (to translate the data training to the target languages) for the knowledge sharing between languages.…”
Section: Models (mentioning)
confidence: 98%
“…[120] Traditional models Experimented with the use of machine translation tools to translate the training data to the target language and exploited a wide range of traditional models including SVM, naïve Bayes, and random forest. [59] Neural based Proposed a joint-learning architecture based on LSTM coupled with features from HurtLex to transfer knowledge between domains and languages. [94] Transformer based Proposed multichannel architecture based on BERT model, which learns the task sequentially in three languages: source languages, English, and Chinese.…”
Section: What Has Been Done So Far In Multilingual Abusive Language Detection Study? (mentioning)
confidence: 99%
“…Ousidhoum et al. [46] presented the first multilingual multi-aspect hate speech analysis dataset in English, French, and Arabic tweets and evaluated several multilingual multi-task learning approaches for the identification of hate in a multilingual setting. Ibrohim and Budi [54] investigated the effect of the machine translation approach in multilingual hate speech detection in Hindi, English, and Indonesian, by comparing classifiers trained with/without translating samples. Ranasinghe and Zampieri [44] employed a cross-lingual contextual word embeddings model, XLM-R, to transfer knowledge from a rich-resourced language, English, to a lower-resource language (i.e., Bengali, Hindi, or Spanish) to predict offensive content in less-resourced languages.…”
Section: Multilingual (mentioning)
confidence: 99%
“…Naïve Bayes algorithm is a classification algorithm that is quite good and is often used in various studies [11]-[13]. This algorithm can be used for simple classification with fixed Y variable and also for text classification [14]-[16]. Laga and Sarno [17] showed that Naïve Bayes gave the best accuracy from other classification methods, such as KNN, SVM, and random forest.…”
Section: Introduction (mentioning)
confidence: 99%