Proceedings of the 13th International Workshop on Semantic Evaluation 2019
DOI: 10.18653/v1/s19-2082

STUFIIT at SemEval-2019 Task 5: Multilingual Hate Speech Detection on Twitter with MUSE and ELMo Embeddings

Abstract: We evaluate the viability of multilingual learning for the task of hate speech detection. We also experiment with adversarial learning as a means of creating a multilingual model. Ultimately, our multilingual models performed worse than their monolingual counterparts. We find that the choice of word representations (word embeddings) is crucial for deep learning, as a simple switch between MUSE and ELMo embeddings yielded a 3-4% increase in accuracy. This also shows the importance of context when de…
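As a hedged illustration of the embedding switch the abstract describes, the sketch below contrasts static MUSE vectors with contextual ELMo representations. The file paths, vocabulary, and sentences are placeholders, not the authors' actual setup; it assumes the plain-text MUSE vector format and the allennlp ELMo module.

```python
# Sketch: swapping static MUSE vectors for contextual ELMo embeddings.
# All paths and example inputs are placeholders, not the paper's settings.
import numpy as np

def load_muse_vectors(path, vocab):
    """Read aligned MUSE vectors (plain-text fastText format) for `vocab`."""
    table = {}
    with open(path, encoding="utf-8") as f:
        next(f)                                   # skip "<count> <dim>" header
        for line in f:
            word, *values = line.rstrip().split(" ")
            if word in vocab:
                table[word] = np.asarray(values, dtype=np.float32)
    return table

muse = load_muse_vectors("wiki.multi.en.vec", {"hate", "speech"})

# Contextual alternative: ELMo produces a different vector for each
# occurrence of a word, depending on the sentence around it.
from allennlp.modules.elmo import Elmo, batch_to_ids

options = "elmo_options.json"    # placeholder paths to pretrained ELMo files
weights = "elmo_weights.hdf5"
elmo = Elmo(options, weights, num_output_representations=1, dropout=0.0)

sentences = [["This", "is", "hate", "speech"], ["I", "hate", "Mondays"]]
char_ids = batch_to_ids(sentences)
embeddings = elmo(char_ids)["elmo_representations"][0]  # (batch, tokens, 1024)
```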


Cited by 18 publications (15 citation statements)
References 9 publications

“…The deep learning methods can be roughly divided into two categories: one focuses on front-end processing, which optimizes the word embedding technology, and the other on mid-end processing, which usually uses simple word- or character-based embeddings and pays more attention to the intermediate neural network processing. The most famous front-end methods are Embeddings from Language Models (ELMo) [6] [13], which trains word vectors with context, and Bidirectional Encoder Representations from Transformers (BERT) [14] [15]. BERT is the first deeply bidirectional, unsupervised language representation, trained on unlabeled text by jointly conditioning on both left and right context in all layers.…”
Section: B. State of the Art in Deep Learning (mentioning)
Confidence: 99%
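To make the static-versus-contextual distinction in this quote concrete, here is a minimal sketch using the Hugging Face transformers library; the bert-base-uncased checkpoint is an illustrative choice, not one used by the cited work.

```python
# Minimal sketch of contextual embeddings with BERT; the checkpoint
# name below is an illustrative assumption.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# The same word receives different vectors in different contexts.
sentences = ["I hate this weather", "They spread hate online"]
batch = tokenizer(sentences, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state   # (batch, tokens, 768)
```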
“…Recently, the machine learning approach, which learns the associations between pieces of text and the output expected for a particular input from pre-labeled training examples, has become popular in scientific studies of hate speech detection. Among the various machine learning methods, deep learning, a subset of machine learning, is very prominent in Natural Language Processing (NLP) for tackling text classification [5] [6].…”
Section: Introduction (mentioning)
Confidence: 99%
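A minimal sketch of the supervised setup this quote describes, where a classifier learns the label expected for each input from pre-labeled examples; the four-example dataset below is invented purely for illustration.

```python
# Supervised learning from pre-labeled examples: fit a classifier that
# maps text to a label. The tiny dataset is made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["you are all scum", "have a nice day",
         "go back where you came from", "great game last night"]
labels = [1, 0, 1, 0]          # 1 = hateful, 0 = not hateful

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["what a lovely day"]))   # expected: [0]
```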
“…Global Vectors for Word Representation (GloVe) [6] and random embeddings as input to DNN classifiers have been compared in [7]. Recently, sentence embeddings [9] and Embeddings from Language Models (ELMo) [10] were used as input to classifiers for toxic comment classification. A multi-feature approach combining various lexicons and semantic features is presented in [11].…”
Section: She Looks Like a Plastic Monkey Doll! (mentioning)
Confidence: 99%
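A small sketch of the comparison described in the quote: the same PyTorch embedding layer initialized either randomly or from pretrained GloVe vectors. The vocabulary and the stand-in GloVe table are assumptions; real vectors would be read from a glove.*.txt file.

```python
# Comparing random vs. GloVe-initialized embeddings for a DNN classifier.
# `glove` is a stand-in table; real vectors come from a glove.*.txt file.
import numpy as np
import torch
import torch.nn as nn

vocab = ["<pad>", "hate", "speech", "love"]
dim = 50
glove = {w: np.random.randn(dim).astype(np.float32) for w in vocab[1:]}

# Variant 1: random initialization, learned entirely from scratch.
random_emb = nn.Embedding(len(vocab), dim, padding_idx=0)

# Variant 2: initialized from pretrained vectors, then fine-tuned.
weights = torch.zeros(len(vocab), dim)
for i, w in enumerate(vocab):
    if w in glove:
        weights[i] = torch.from_numpy(glove[w])
glove_emb = nn.Embedding.from_pretrained(weights, freeze=False, padding_idx=0)
```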
“…To protect these users from online harassment, it is necessary to develop tools that can automatically detect toxic language in social media. In fact, many toxic language detection (TLD) systems have been proposed in recent years, based on models such as support vector machines (SVM) (Gaydhani et al., 2018), bi-directional long short-term memory (BiLSTM) (Bojkovský and Pikuliak, 2019), and logistic regression and fine-tuned BERT (d'Sa et al., 2020).…”
Section: Introduction (mentioning)
Confidence: 99%
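Of the models this quote lists, the BiLSTM lends itself to a compact sketch; the layer sizes and mean-pooling readout below are illustrative assumptions, not the cited authors' architecture.

```python
# Sketch of a BiLSTM toxic-language classifier of the kind the quote
# cites; all sizes are illustrative assumptions.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100, hidden=128, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hidden, classes)   # 2x: both directions

    def forward(self, token_ids):
        h, _ = self.lstm(self.embed(token_ids))     # (batch, seq, 2*hidden)
        return self.out(h.mean(dim=1))              # mean-pool over tokens

logits = BiLSTMClassifier()(torch.randint(1, 10000, (4, 20)))  # toy batch
```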