“…As mentioned above, model bias will be a particular issue when considering data from participants belonging to different communities, where one group's use of identity terms can systematically alter the results obtained for that group; unlike with lexicon-based approaches, it is not always easy to identify which terms will lead to bias without testing against a data set such as ours. There is a significant body of work seeking to develop less biased language models for a range of tasks (Dixon et al., 2018; Liang et al., 2020; Schick et al., 2021; Ungless et al., 2022; Webster et al., 2021; Zhao et al., 2018), for example by using counterfactually augmented data (Sen et al., 2021), which researchers with the right technical skills may be able to adopt where they have access to the original model. However, for those who must rely on third-party tools, our findings suggest that marginalised individuals, and in particular those with the least salient identities, continue to be affected by predictive bias despite the likely use of debiasing strategies.…”
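To make the idea of probing a third-party tool for identity-term bias more concrete, the sketch below swaps identity terms into otherwise identical template sentences and compares the scores a classifier returns, in the spirit of counterfactual perturbation testing. The term list, templates, and the `score_text` interface are illustrative assumptions, not the tooling or data used in this work.

```python
from typing import Callable, Dict, List

# Hypothetical identity terms to swap in and out of fixed template sentences.
IDENTITY_TERMS: List[str] = ["women", "men", "nonbinary people", "deaf people"]

# Hypothetical neutral templates; the predicted score should not depend on the
# identity term if the model is unbiased with respect to these groups.
TEMPLATES: List[str] = [
    "I went to the park with some {group} yesterday.",
    "The event was organised by {group} from my neighbourhood.",
]


def probe_identity_bias(score_text: Callable[[str], float]) -> Dict[str, float]:
    """Return the mean score per identity term across all templates.

    Large gaps between terms on identical templates are a signal of
    predictive bias in the scoring model.
    """
    means: Dict[str, float] = {}
    for term in IDENTITY_TERMS:
        scores = [score_text(t.format(group=term)) for t in TEMPLATES]
        means[term] = sum(scores) / len(scores)
    return means


if __name__ == "__main__":
    # Stand-in scorer for demonstration only; in practice this would wrap a
    # third-party sentiment or emotion classification API.
    def dummy_scorer(text: str) -> float:
        return 0.0 if "deaf" in text else 0.5  # illustrative biased behaviour

    for term, mean in probe_identity_bias(dummy_scorer).items():
        print(f"{term}: {mean:.2f}")
```

A probe of this kind only surfaces bias for the terms and templates it is given, which is precisely the limitation the passage above notes: without a purpose-built evaluation data set, it is hard to know in advance which identity terms a third-party model will treat differently.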