Proceedings of the 13th International Workshop on Semantic Evaluation 2019
DOI: 10.18653/v1/s19-2126

NLPR@SRPOL at SemEval-2019 Task 6 and Task 5: Linguistically enhanced deep learning offensive sentence classifier

Abstract: The paper presents a system developed for the SemEval-2019 competition Task 5 HatEval (Basile et al., 2019) (team name: LU Team) and Task 6 OffensEval (Zampieri et al., 2019b) (team name: NLPR@SRPOL), where we achieved 2nd position in Subtask C. The system combines in an ensemble several models (LSTM, Transformer, OpenAI's GPT, Random Forest, SVM) with various embeddings (custom, ELMo, fastText, Universal Encoder) together with additional linguistic features (number of blacklisted words, special characters, e…
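The abstract describes ensembling several classifiers and adding hand-crafted linguistic features such as blacklisted-word and special-character counts. A minimal sketch of that idea follows; the blacklist, the feature set, and the soft-voting rule are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch: simple linguistic features plus a soft-voting
# ensemble over per-model P(offensive) scores. All names and thresholds
# here are hypothetical, not taken from the NLPR@SRPOL system.
import re

BLACKLIST = {"idiot", "stupid"}  # hypothetical blacklist


def linguistic_features(text):
    """Count blacklisted words and special characters in a sentence."""
    tokens = text.lower().split()
    n_blacklisted = sum(tok.strip(".,!?") in BLACKLIST for tok in tokens)
    n_special = len(re.findall(r"[^\w\s]", text))
    return [n_blacklisted, n_special]


def soft_vote(probabilities):
    """Average each model's P(offensive) and threshold at 0.5."""
    avg = sum(probabilities) / len(probabilities)
    return ("OFF" if avg >= 0.5 else "NOT", avg)
```

In a full system, each base model would produce its own probability from text embeddings, and features like these counts would feed the classical models (Random Forest, SVM) in the ensemble.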

Cited by 16 publications (10 citation statements) · References 26 publications
“…However, many teams used ensembles of deep learning models (Mahata et al., 2019) to benefit from their minimal need for feature engineering and their ability to boost classifier performance. Moreover, to address the small-dataset problem, some teams used the BERT model (Liu et al., 2019) and others utilized external datasets to further increase the training data (Seganti et al., 2019). In this paper, we show that using back-translation data augmentation and transfer learning significantly improves offensive language prediction performance.…”
Section: Related Work
confidence: 84%
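The back-translation augmentation mentioned in this citing work can be sketched as follows. The translation functions are stand-ins for a real machine-translation system; no specific MT API is assumed.

```python
# Hedged sketch of back-translation data augmentation: translate each
# training sentence to a pivot language and back, yielding a paraphrase
# that keeps the original label. to_pivot/from_pivot are placeholders
# for real MT calls.
def back_translate(text, to_pivot, from_pivot):
    """Round-trip a sentence through a pivot language."""
    return from_pivot(to_pivot(text))


def augment(dataset, to_pivot, from_pivot):
    """Return the dataset plus label-preserving paraphrases."""
    augmented = list(dataset)
    for text, label in dataset:
        paraphrase = back_translate(text, to_pivot, from_pivot)
        if paraphrase != text:  # keep only genuinely new variants
            augmented.append((paraphrase, label))
    return augmented
```

With a real MT backend, the round trip typically rewords the sentence while preserving its meaning, so the label can be reused on the paraphrase.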
“…It is also possible to use an ensemble of the above methods [59,61]. In fact, such an approach has recently provided the best performance (based on the average performance on all subtasks) in a competition among more than fifty participating teams [77].…”
Section: Related Work
confidence: 99%
“…As the overall best-performing model in Task 6 of the 2019 SemEval competition was an ensemble of different models [77], we also experimented with an ensemble solution. For this experiment, to still incorporate additional data in RoBERTa, we combined the RoBERTa model trained on HASOC data with the FastText model trained on all databases, using pretrained word embeddings.…”
Section: Ensemble Of Classifiers
confidence: 99%
“…The systems developed for these tasks used cutting-edge NLP, machine learning, and deep learning techniques. Some key systems for OffensEval and HatEval, such as Fermi (Indurthi et al., 2019), used the Universal Sentence Encoder to build an SVM model; NLPR@SRPOL (Seganti et al., 2019) used an ensemble of deep learning models like OpenAI Finetune and Transformer, while NULI developed a BERT-based model. Although some systems have achieved reasonable performance (e.g., 0.82 F1-score for Subtask A of OffensEval by NULI), most other systems still lack 'good' performance on other subtasks such as identifying the target and type of offenses.…”
Section: Related Work
confidence: 99%