ESANN 2021 Proceedings
DOI: 10.14428/esann/2021.es2021-48
Toxicity Detection in Online Comments with Limited Data: A Comparative Analysis

Abstract: We present a comparative study on toxicity detection, focusing on the problem of identifying toxicity types of low prevalence and possibly even unobserved at training time. For this purpose, we train our models on a dataset that contains only a weak type of toxicity, and test whether they are able to generalize to more severe toxicity types. We find that representation learning and ensembling exceed the classification performance of simple classifiers on toxicity detection, while also providing significantly b…

Cited by 3 publications (1 citation statement)
References 6 publications
“…In the related OCC setting with its focus on outlier detection, DNN-based approaches have been researched from three angles: (1) combining kernel methods [65] with DNN methods [14,60,71], (2) outlier detectors based on generative models (e.g., generative adversarial networks [22] or variational autoencoders [35]) [50,64,72], and (3) approaches based on (semi-)supervised autoencoders [8,10,26,44,45,47]. Here, the key idea is to learn a representation of the inlier distribution and subsequently to estimate the outlierness of a sample via its reconstruction error.…”
Section: Related Work
confidence: 99%
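The reconstruction-error idea described in the statement above can be sketched in a few lines. As a minimal stand-in for a supervised autoencoder, the example below fits a linear autoencoder (equivalent to PCA via SVD) on inlier data and scores new samples by how poorly they reconstruct; all variable names and the synthetic data are illustrative assumptions, not from the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic inlier data: points concentrated near a 1-D subspace of R^2.
X_train = rng.normal(size=(500, 2)) * np.array([3.0, 0.1])

# "Train" a linear autoencoder: keep the top principal direction.
X_mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - X_mean, full_matrices=False)
W = Vt[:1]  # shared encoder/decoder weights, shape (1, 2)

def outlier_score(x):
    """Outlierness of x = reconstruction error under the inlier model."""
    z = (x - X_mean) @ W.T       # encode into the 1-D latent space
    x_hat = z @ W + X_mean       # decode back to input space
    return np.linalg.norm(x - x_hat, axis=-1)

inlier = np.array([2.0, 0.0])    # lies along the learned subspace
outlier = np.array([0.0, 5.0])   # orthogonal to it
print(outlier_score(inlier) < outlier_score(outlier))
```

Samples drawn from the inlier distribution reconstruct almost perfectly, while off-manifold samples incur a large error, which is exactly the scoring principle the quoted passage attributes to (semi-)supervised autoencoder approaches.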