2021
DOI: 10.48550/arxiv.2110.09338
Preprint

Contextual Hate Speech Detection in Code Mixed Text using Transformer Based Approaches

Abstract: In the recent past, social media platforms have helped people in connecting and communicating to a wider audience. But this has also led to a drastic increase in cyberbullying. It is essential to detect and curb hate speech to keep the sanity of social media platforms. Also, code mixed text containing more than one language is frequently used on these platforms. We, therefore, propose automated techniques for hate speech detection in code mixed text from scraped Twitter. We specifically focus on code mixed Eng…

Citations: cited by 2 publications (2 citation statements)
References: 21 publications
“…In this section, we will try to mainly discuss previous attempts in the creation of code-mixed datasets. User-generated content is the main source of codemixed data, and preprocessing is necessary for tasks like profanity hate speech [20,3,11,22,17], sentiment analysis, etc. Various attempts of scraping have been done before for the initial set of code-mixed data and later augmented synthetically using equivalence constraint theory [19], semi-supervised learning [9] and rule-based language-pair approaches [26].…”
Section: Related Work (mentioning)
confidence: 99%
“…The state-of-the-art hostile text detection methods are available for English language. To enhance the research in low resource Indian languages, we have studied various methods which can detect hostile posts in Hindi (1, 2, 6-8, 13, 14), Marathi (7, 11, 12), Bengali (12), Saudi (4), Roman Urdu (15, 16), Tamil (17) and codemixed language (7, 18-22). Figure 1 shows basic approaches used for hostile post-detection.…”
Section: Introduction (mentioning)
confidence: 99%