Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.557

HateGAN: Adversarial Generative-Based Data Augmentation for Hate Speech Detection

Abstract: Academia and industry have developed machine learning and natural language processing models to detect online hate speech automatically. However, most of these existing methods adopt a supervised approach that heavily depends on labeled datasets for training. This results in the methods' poor detection performance of the hate speech class as the training datasets are highly imbalanced. In this paper, we propose HateGAN, a deep generative reinforcement learning model, which addresses the challenge of imbalance …

Cited by 30 publications (11 citation statements)
References 36 publications
“…This synonym is usually gotten from a dictionary or thesaurus such as WordNet in [12] or calculated using the cosine similarities between the target word and words in a pre-trained word embedding such as Word2Vec in [30,39] or Glove in [12]. Another well-known method is generation of new text using RNN [30] or GPT [12,18,36,42] or GAN [3]. In [44], they applied dependency based embeddings for word substitution to generate text while leveraging textual membership queries.…”
Section: Related Studies
Citation type: mentioning
confidence: 99%
“…3 : This contains approximately 25k instances from Twitter in English with labels of Hate, Offensive, or Neither hateful nor offensive [6], more specifically, 5.77%…”
Citation type: mentioning
confidence: 99%
“…As a type of semi-supervised method, GANs include the generative model, which is mainly used to challenge the discriminator of GANs, while the generative models in some DA methods are directly used to augment training data. Moreover, the generative model of GANs is applied as a DA method in some scenes like [61,119,96,68,109,116], and have demonstrated to be effective for data augmentation purposes.…”
Section: Generative Adversarial Network
Citation type: mentioning
confidence: 99%
“…The increase in the use of social media outlets has led to exploitation in HS organizations that advocate or criticize hate and offensive materials. Every other day, social media giants continue their efforts to detect and prevent the spread of offensive, anarchical, or propagation HS materials without slowing down [2]. In the face of crises such as COVID-19, which unexpectedly broke out on the world agenda, technology giants had to keep their existing technological structures up-to-date to prevent the spread of sharing HS materials.…”
Section: Introduction
Citation type: mentioning
confidence: 99%