Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH) 2022
DOI: 10.18653/v1/2022.woah-1.22

Targeted Identity Group Prediction in Hate Speech Corpora

Abstract: The past decade has seen an abundance of work seeking to detect, characterize, and measure online hate speech. A related, but less studied, problem is the specification of identity groups targeted by that hate speech. Predictive accuracy on this task can supplement analyses beyond hate speech detection, motivating its study. Using the Measuring Hate Speech corpus, which provided annotations for targeted identity groups on roughly 50,000 social media comments, we create neural network models to perfo…

Cited by 4 publications (13 citation statements)
References 29 publications
“…exam, survey, or annotation guidelines), and collected responses can be transformed into a numerical scale capable of measuring the underlying phenomenon. Measurement scales, heavily used in fields such as education [31], have recently been leveraged in natural language processing, for example in hate speech research [14,16]. Further refinement of this construct could pave the way to tools that support measurement of racism narratives in the medical literature at scale and over time.…”
Section: Future Research (mentioning)
confidence: 99%
“…Our research aims to fill this important gap by providing a comprehensive and data-driven approach to studying racism narratives, essential for shedding light on the mechanisms perpetuating health inequities. Leveraging techniques developed in computational grounded theory and measurement theory [13–16], we aim to provide a thorough assessment of the discourse around racism as a root cause of health inequities in leading medical journals. The objective of this study was to develop a framework of racism narratives by categorizing and ordering narratives with excerpts from influential medical studies.…”
Section: Introduction (mentioning)
confidence: 99%
“…Measuring Hate Speech 2020 & 2022 (Kennedy et al. 2020; Sachdeva et al. 2022): a collaborative effort that compiled a dataset sourced from three diverse platforms: Twitter, Reddit, and YouTube. To assess the hatefulness of the content, the authors adopted a linear hate speech scale built with Rasch item response theory (IRT).…”
Section: Data Acquisition and Preparation (mentioning)
confidence: 99%
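For orientation, the scale referenced in this excerpt rests on the standard dichotomous Rasch model; the exact parameterization used for the corpus (which also accounts for rater severity) may differ, so the symbol mapping below is an assumption. The probability that comment i receives a positive response on survey item j is

\[ P(X_{ij} = 1 \mid \theta_i, b_j) = \frac{\exp(\theta_i - b_j)}{1 + \exp(\theta_i - b_j)} \]

where \theta_i is the comment's latent hate speech score and b_j is the item's difficulty; fitting the model places comments on a continuous interval-like scale rather than assigning a single binary hate/not-hate label.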
“…Akhtar, Basile, and Patti (2020) showed that a strong perspectivist approach to model training may also lead to performance improvements. Similarly, Kocoń et al. (2021) proposed leveraging non-aggregated data to train models adapted to different users, in what they call a "human-centered approach". Sudre et al. (2019), Gordon et al. (2022), and Guan et al. (2018) proposed multi-task approaches to deal with observer variability and dissenting voices, showing how jointly learning the consensus process and the individual raters' labels improves classification accuracy and representation. Sachdeva et al. (2022a) and Kralj Novak et al. (2022) showed how accounting for disagreements among raters may more accurately represent the performance of ML models in hate speech detection and also improve the identification of target groups. Similarly, Rodrigues and Pereira (2018) proposed a novel deep learning model that achieves state-of-the-art results on various crowdsourced datasets by internally capturing the reliability and biases of different annotators. Peterson et al. (2019) showed that accounting for raters' disagreement and uncertainty may lead to generalizability and performance improvements in computer vision tasks. Uma et al. (2020) proposed soft losses as a perspectivist approach to training ML models on NLP tasks, while Campagner et al. (2021) proposed a soft-loss ensemble learning method, inspired by possibility theory and three-way decisions, for training ML models in perspectivist settings. Similarly, Washington et al. (2021) showed how soft labels, that is, distributions over labels obtained by means of crowdsourcing, can better account for the subjectivity of human interpretation in emotion recognition tasks.…”
Section: Review of Perspectivist Approaches in AI (mentioning)
confidence: 99%
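To make the soft-label idea in this excerpt concrete, here is a minimal sketch in PyTorch; the classifier, feature dimension, and vote counts are illustrative stand-ins under assumed data, not the formulation of any cited paper:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in classifier: 8 input features -> 3 classes
# (e.g., not hateful / ambiguous / hateful); purely illustrative.
model = nn.Linear(8, 3)
inputs = torch.randn(2, 8)  # feature vectors for two example comments

# Raw annotator votes per comment (three raters each); counts are made up.
votes = torch.tensor([[2., 1., 0.],
                      [0., 1., 2.]])
soft_targets = votes / votes.sum(dim=1, keepdim=True)  # per-comment label distributions

# Soft cross-entropy: push the predicted distribution toward the annotators'
# distribution instead of a single aggregated "gold" label.
log_probs = F.log_softmax(model(inputs), dim=1)
loss = -(soft_targets * log_probs).sum(dim=1).mean()
loss.backward()

Training against the full vote distribution preserves signal about contested comments that majority-vote aggregation would discard.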