Proceedings of the 13th International Workshop on Semantic Evaluation 2019
DOI: 10.18653/v1/s19-2119
KMI-Coling at SemEval-2019 Task 6: Exploring N-grams for Offensive Language detection

Abstract: In this paper, we present the system description of the offensive language detection tool developed by the KMI-Coling group for the OffensEval shared task, which was conducted at the SemEval 2019 workshop. To develop the system, we explored n-grams up to 8-grams and trained three separate systems, A, B, and C, for the three sub-tasks within the OffensEval task, achieving accuracies of 79.76%, 87.91%, and 44.37% respectively. The task was completed using the da…
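The abstract's core feature scheme (word n-grams from unigrams up to 8-grams) can be sketched as follows. This is a hypothetical minimal reconstruction for illustration, not the authors' released code; the function name and interface are assumptions:

```python
from collections import Counter

def word_ngrams(text, max_n=8):
    """Count all word n-grams from 1-grams up to max_n-grams.

    NOTE: illustrative sketch only -- the paper's actual preprocessing
    (tokenization, casing, etc.) is not specified in the excerpt above.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

feats = word_ngrams("you are such a fool", max_n=3)
# 5 unigrams + 4 bigrams + 3 trigrams = 12 distinct features
```

Such count vectors would then feed a standard classifier (the table cited below pairs n-gram features of this kind with SVMs).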


Cited by 10 publications (11 citation statements)
References 11 publications
“…It also endorsed the findings of [2]. The regression-based classifier SimpleLogistic and linear SVM outperform the other models on unigrams, as concluded in [16], achieving a 94.2% F-measure. For combined word n-grams, after unigrams, the uni+bi+tri-gram combination shows better performance than the other four n-gram combinations.…”
Section: A. Word N-grams (supporting)
confidence: 72%
“…Several studies show that n-gram approaches at the character and word level are more effective at detecting offensive language than Bag-of-Words (BoW) [2], [15], [23]. [16] explored word n-grams to detect offensive language in tweets. [9], [19] employed word n-grams to detect offensive language in YouTube comments; [20], [22], [24] also used word n-grams to detect offensive language in comments collected from blogs and emails.…”
Section: Related Work (mentioning)
confidence: 99%
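The citation above contrasts character- and word-level n-grams with plain BoW (which is just the word 1-gram special case). A minimal sketch of the character-level variant, again hypothetical rather than any cited system's code, shows why it is robust to spelling obfuscation such as "fo0l":

```python
from collections import Counter

def char_ngrams(text, n_range=(1, 3)):
    """Count character n-grams of lengths n_range[0]..n_range[1].

    Illustrative sketch: even if a word is obfuscated ("fo0l"), most of
    its character n-grams ("fo", "0l", ...) still overlap with the
    offensive term, unlike a whole-word BoW feature.
    """
    lo, hi = n_range
    counts = Counter()
    for n in range(lo, hi + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts

char_ngrams("fool", n_range=(2, 2))
# -> Counter({'fo': 1, 'oo': 1, 'ol': 1})
```

Setting `n_range=(1, 1)` over whitespace-split tokens instead of characters would recover plain BoW.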
“…Some examples of such widely known terms used to threaten are "blood, kill, murder, death, and stab". To detect threatening language, the computational linguistics community has been focusing on online platforms like YouTube and Twitter [2, 22, 24-28, 31, 36, 38-41]:

Ref   Language             Features                                  Models                            Platform
[25]  English              BoW, char n-grams                         SVM, LR, CNN                      YouTube
[26]  English              BoW, word n-grams (2, 3, 5)               SVM, NB                           YouTube
[15]  English              BoW, GloVe, fastText                      1D-CNN, LSTM, BiLSTM              Twitter
[27]  English              unigrams                                  SVM, CNN, BiLSTM                  Twitter
[28]  English              word n-grams (1-8)                        SVM                               Twitter
[6]   English              Latent Dirichlet Allocation (LDA)         LR                                Online comments
[1]   English              word n-grams, char n-grams                NB, SVM                           Twitter
[29]  English              word n-grams (3-8), char n-grams (1-3)    CNN, RNN, RF, NB, SVM             Twitter
[30]  English              word n-grams                              SVM (linear, polynomial, radial)  Twitter, articles
[31]  English              abusive and non-abusive word list         k-means                           Twitter, blogs
[32]  English, Portuguese  hateword2vec, hatedoc2vec, unigrams       NB, SVM                           YouTube
[22]  Arabic               word n-grams                              SVM                               Twitter
[33]  Spanish              word n-grams, char n-grams                LR                                Twitter
[34]  Indonesian           Latent Dirichlet Allocation (LDA)         -                                 Twitter, Facebook, Reddit
[35]  Danish, English      char n-grams                              LR, BiLSTM                        Twitter
[17]  German               Wikipedia embedding                       CNN                               Twitter
[19]  Italian              BERT tokens                               AlBERTo                           Blogs
[36]  Japanese             word n-grams (1-5)                        SVM                               Facebook
[37]  Bangla               word n-grams (1-3)                        MNB, SVM, CNN, LSTM               Twitter, Instagram
[37]  Turkish              -                                         MNB, SVM, DT (C4.5), KNN          Facebook, Instagram, blogs
…”
Section: B. Datasets and Approaches to Threatening Language Detection (mentioning)
confidence: 99%