2020
DOI: 10.3390/app10238631

Comparison of Deep Learning Models and Various Text Pre-Processing Techniques for the Toxic Comments Classification

Abstract: The emergence of anti-social behaviour in online environments presents a serious issue in today’s society. Automatic detection and identification of such behaviour are becoming increasingly important. Modern machine learning and natural language processing methods can provide effective tools to detect different types of anti-social behaviour from pieces of text. In this work, we present a comparison of various deep learning models used to identify toxic comments in Internet discussions. Our main go…

Cited by 66 publications (27 citation statements)
References 42 publications
“…The obstacle in the two studies lies in the small amount of data, so additional data is needed for further research. On the other hand, pre-trained models have been used to solve other NLP problems, such as text-based emoticon classification [2] and toxic comment classification [30]; using RoBERTa and XLNet shows an improvement in accuracy when combined with other NLP features such as TF-IDF and sentiment analysis.…”
Section: Related Workmentioning
confidence: 99%
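The citation statement above mentions adding TF-IDF features alongside pre-trained models such as RoBERTa and XLNet. As a minimal, library-free illustration of how TF-IDF weights are derived (the function name and the smoothed-IDF variant are assumptions for this sketch, not taken from the cited works, which would typically use scikit-learn's `TfidfVectorizer`):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute TF-IDF vectors for a list of tokenised documents.

    Uses term frequency normalised by document length and a smoothed
    IDF, idf(t) = log((1 + N) / (1 + df(t))) + 1. Illustrative only.
    """
    n = len(docs)
    df = Counter()                      # document frequency per term
    for doc in docs:
        df.update(set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        vectors.append({t: (cnt / total) * idf[t] for t, cnt in tf.items()})
    return vectors

# Toy two-comment corpus (hypothetical data).
docs = [
    "you are a great person".split(),
    "you are a terrible awful troll".split(),
]
vecs = tfidf_vectors(docs)
```

Terms that appear in only one document (e.g. "great") receive a higher IDF, and hence a higher weight, than terms shared across documents (e.g. "you"); such weights can then be concatenated with learned embeddings as extra features.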
“…Then, performance can be calculated with two operations, namely micro-averaging and macro-averaging. Because of the class imbalance in the dataset, we used micro-averaging in this study [36]. To plot the ROC curve, the required TPR and FPR are computed as follows, where i stands for each testing class and n is 15 classes in this study.…”
Section: Discussionmentioning
confidence: 99%
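The excerpt refers to micro-averaged TPR and FPR formulas that the snippet does not reproduce. A hedged sketch of the standard micro-averaging computation — pooling per-class confusion counts before dividing — with a toy three-class example rather than the study's 15 classes:

```python
def micro_tpr_fpr(confusions):
    """Micro-averaged TPR and FPR from per-class confusion counts.

    Each entry in `confusions` is a (TP, FP, FN, TN) tuple for one
    class i; micro-averaging sums counts over all classes first:
        TPR = sum_i TP_i / sum_i (TP_i + FN_i)
        FPR = sum_i FP_i / sum_i (FP_i + TN_i)
    This is the textbook definition, not code from the cited study.
    """
    tp = sum(c[0] for c in confusions)
    fp = sum(c[1] for c in confusions)
    fn = sum(c[2] for c in confusions)
    tn = sum(c[3] for c in confusions)
    return tp / (tp + fn), fp / (fp + tn)

# Hypothetical confusion counts for three classes.
cms = [(8, 2, 2, 88), (5, 1, 5, 89), (9, 3, 1, 87)]
tpr, fpr = micro_tpr_fpr(cms)
```

Because counts are pooled before the ratio is taken, classes with many instances dominate the average, which is why micro-averaging is preferred for imbalanced datasets like the one discussed.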
“…Liu et al. [23] proposed a model for multi-label text classification using ELMo and attention with GRU on Kaggle's toxic comment classification data. Krešnáková et al. [24] carried out experiments on the same Kaggle dataset using different text pre-processing techniques with DL models such as CNN, GRU, Bi-LSTM + CNN, and Bi-GRU + CNN. Özel et al. [26] implemented various ML models such as SVM and kNN.…”
Section: Related Studiesmentioning
confidence: 99%