2020
DOI: 10.48550/arxiv.2012.02565
Preprint

Automated Detection of Cyberbullying Against Women and Immigrants and Cross-domain Adaptability

Abstract: Cyberbullying is a prevalent and growing social problem due to the surge in social media use. Minorities, women, and adolescents are among the common victims of cyberbullying. Despite advances in NLP technologies, automated cyberbullying detection remains challenging. This paper focuses on advancing the technology using state-of-the-art NLP techniques. We use a Twitter dataset from SemEval-2019 Task 5 (HatEval) on hate speech against women and immigrants. Our best performing ensemble m…

Cited by 2 publications (9 citation statements)
References 12 publications
“…Herath et al. [63] developed and evaluated a cyberbullying classification model using DistilBERT and state-of-the-art NLP technology. The dataset collected from…”
Section: DistilBERT (mentioning)
confidence: 99%
“…Three models, built on a training dataset by changing the ratios of the majority classes, act as base models. The final ensemble model was built using a Simple Voting Classifier [63]. Since the dataset used in this thesis is comparatively balanced, a simple DistilBERT model was developed and the results are discussed below:…”
Section: Evaluating DistilBERT Model (mentioning)
confidence: 99%
“…The TF-IDF algorithm is based on word statistics for text feature extraction. TF-IDF is used to vectorize the input [1]. The model considers only the expression of words that are similar in all texts.…”
Section: TF-IDF (mentioning)
confidence: 99%
“…This has resulted in bullying general or specific users and user groups, either knowingly or unknowingly. The abuses resulting from cyberbullying can cause psychological harm to the target users and groups [1].…”
Section: Introduction (mentioning)
confidence: 99%