2021
DOI: 10.48550/arxiv.2112.09986
Preprint

Leveraging Transformers for Hate Speech Detection in Conversational Code-Mixed Tweets

Abstract: In the current era of the internet, where social media platforms are easily accessible for everyone, people often have to deal with threats, identity attacks, hate, and bullying due to their association with a caste, creed, gender, religion, or even acceptance or rejection of a notion. Existing works in hate speech detection primarily focus on individual comment classification as a sequence labelling task and often fail to consider the context of the conversation. The context of a conversation often plays a sub…

citations
Cited by 4 publications
(3 citation statements)
references
References 37 publications
“…About 7000 posts in code-mixed Hindi and English are classified as non-hate offensive (NOT) and hate and offensive (HOF). Few studies used this dataset to perform the subtask to classify the posts as NOT and HOF in code mixed language [27][28][29]. Another dataset is there to detect hate speech and offensive content identification in Indo-European languages [15].…”
Section: Related Work (mentioning)
confidence: 99%
“…Moreover, multilingual transformers, particularly mBERT (multilingual BERT) or XLM-RoBERTa, have been implemented in the multilingual domain for hate speech detection tasks. These models have provided cutting-edge performance in crosslingual and multilingual settings, where several studies demonstrate their usefulness in many languages, especially in low-resource ones [19,20].…”
Section: Introduction (mentioning)
confidence: 99%
“…The state-of-the-art hostile text detection methods are available for English language. To enhance the research in low resource Indian languages, we have studied various methods which can detect hostile posts in Hindi (1, 2, 6-8, 13, 14), Marathi (7, 11, 12), Bengali (12), Saudi (4), Roman Urdu (15, 16), Tamil (17) and codemixed language (7, 18-22). Figure 1 shows basic approaches used for hostile post-detection.…”
Section: Introductionmentioning
confidence: 99%