Proceedings of the Fourteenth Workshop on Semantic Evaluation 2020
DOI: 10.18653/v1/2020.semeval-1.211

SINAI at SemEval-2020 Task 12: Offensive Language Identification Exploring Transfer Learning Models

Abstract: This paper describes the participation of the SINAI team in Task 12: OffensEval 2: Multilingual Offensive Language Identification in Social Media. In particular, we participated in Sub-task A for English, which consists of classifying tweets as offensive or not offensive. We preprocess the dataset according to the characteristics of the language used on social media. Then, we select a small subset of the training set provided by the organizers and fine-tune different Transformer-based models in order to test their effectiv…
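
The abstract does not say which preprocessing steps were applied, so the following is only a minimal sketch of the kind of social-media normalization it alludes to. The function name `preprocess_tweet`, the placeholder tokens `@USER` and `URL`, and the specific rules (lowercasing, hashtag unwrapping, collapsing elongated characters) are assumptions chosen for illustration, not the authors' pipeline.

```python
import re

# All patterns below are illustrative assumptions, not the paper's rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # web links
MENTION_RE = re.compile(r"@\w+")                # user mentions
HASHTAG_RE = re.compile(r"#(\w+)")              # hashtags, capturing the word
REPEAT_RE = re.compile(r"(.)\1{2,}")            # 3+ repeats of one character


def preprocess_tweet(text: str) -> str:
    """Normalize a tweet before passing it to a tokenizer (hypothetical helper)."""
    text = text.lower()                   # lowercase first so placeholders stay uppercase
    text = URL_RE.sub("URL", text)        # replace links with a placeholder token
    text = MENTION_RE.sub("@USER", text)  # anonymize user mentions
    text = HASHTAG_RE.sub(r"\1", text)    # keep the hashtag word, drop the '#'
    text = REPEAT_RE.sub(r"\1\1", text)   # "sooooo" -> "soo", "!!!" -> "!!"
    return " ".join(text.split())         # collapse runs of whitespace


if __name__ == "__main__":
    print(preprocess_tweet("@John_Doe this is sooooo BAD!!! #NotCool https://t.co/abc123"))
```

Normalizing mentions and links to fixed tokens keeps the vocabulary small and avoids leaking user identities into a fine-tuned classifier; whether each rule helps a given Transformer model would need to be checked empirically.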

Cited by 3 publications (3 citation statements)
References: 9 publications
“…There are several taxonomies to define discrete emotions. The SemEval 2018 Task 1 dataset contains 10,000 annotated English tweets with 12 emotion classes (Mohammad et al, 2018), the EmoVent dataset contains 7,303 English tweets with 8 emotion classes (Plaza del Arco et al, 2020), and the GoEmotions dataset contains 58,000 annotated English reddit comments with 27 emotion classes (Demszky et al, 2020). They all suggest their respective mapping to the taxonomy with seven emotion classes proposed by Ekman (1992) or group them into positive, negative, and ambiguous or neutral classes.…”
Section: Methods (mentioning)
confidence: 99%
“…All of the instances were based on Twitter. Most of the top systems used some pre-trained transformer-based models such as multilingual BERT, BETO and XLM-RoBERTa (Plaza-del Arco, Molina-González, and Alfonso 2021).…”
Section: Related Benchmark Competitions (mentioning)
confidence: 99%
“…The best-performing teams for both sub-tasks used BETO (the Spanish version of BERT model) (Plaza-del Arco et al. 2021).…”
Section: Related Benchmark Competitions (mentioning)
confidence: 99%