Coreference resolution is an important task for natural language understanding, and the resolution of ambiguous pronouns is a longstanding challenge. However, existing corpora do not capture ambiguous pronouns in sufficient volume or diversity to accurately indicate the practical utility of models. Furthermore, we find gender bias in existing corpora and systems that favors masculine entities. To address this, we present and release GAP, a gender-balanced, labeled corpus of 8,908 ambiguous pronoun-name pairs sampled to provide diverse coverage of the challenges posed by real-world text. We explore a range of baselines that demonstrate the complexity of the challenge; the best achieves just 66.9% F1. We show that syntactic structure and continuous neural models provide promising, complementary cues for approaching the challenge.
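The gender-balanced design makes it natural to score a system separately on masculine and feminine examples and compare the two. The sketch below illustrates that kind of split evaluation; it is not code from the paper, and the data layout and the feminine-to-masculine F1 ratio as a bias indicator are assumptions of this example.

```python
# Minimal sketch of a gender-split evaluation: F1 is computed separately for
# masculine and feminine pronoun-name pairs, and their ratio serves as a simple
# indicator of bias. Variable names and data layout are illustrative only.
from sklearn.metrics import f1_score

def gendered_f1(gold, pred, genders):
    """gold/pred: binary coreference labels per pair; genders: 'M' or 'F' per pair."""
    scores = {}
    for g in ("M", "F"):
        idx = [i for i, x in enumerate(genders) if x == g]
        scores[g] = f1_score([gold[i] for i in idx], [pred[i] for i in idx])
    # A ratio below 1.0 indicates the system does worse on feminine examples.
    scores["bias_ratio_F_to_M"] = scores["F"] / scores["M"]
    return scores
```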
Building equitable and inclusive NLP technologies demands consideration of whether and how social attitudes are represented in ML models. In particular, representations encoded in models often inadvertently perpetuate undesirable social biases from the data on which they are trained. In this paper, we present evidence of such undesirable biases towards mentions of disability in two English-language NLP models: a toxicity prediction model and a sentiment analysis model. Next, we demonstrate that the neural embeddings that are the critical first step in most NLP pipelines similarly contain undesirable biases towards mentions of disability. We end by highlighting topical biases in the discourse about disability that may contribute to the observed model biases; for instance, gun violence, homelessness, and drug addiction are over-represented in texts discussing mental illness.
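A common way to surface this kind of bias is a perturbation probe: score sentences that are identical except for whether they mention disability and compare the model's outputs. The snippet below is only an illustration of that idea, not the systems evaluated in the paper; it uses a generic off-the-shelf sentiment classifier as a stand-in.

```python
# Illustrative perturbation probe (not the paper's exact setup): score otherwise
# similar sentences that differ only in whether they mention disability, and
# compare the classifier's outputs. Any off-the-shelf scorer could be used here.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # generic stand-in classifier

sentences = [
    "I am a person with mental illness",   # mentions disability
    "I am a tall person",                  # neutral control
]

for text in sentences:
    result = classifier(text)[0]
    print(f"{text!r}: {result['label']} ({result['score']:.3f})")

# A systematic gap in scores between the two groups of sentences is evidence
# of the kind of undesirable bias described above.
```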
Persons with disabilities face many barriers to full participation in society, and the rapid advancement of technology has the potential to create ever more such barriers. Building equitable and inclusive technologies for people with disabilities demands attention not only to accessibility, but also to how social attitudes towards disability are represented within technology. Representations perpetuated by machine learning (ML) models often inadvertently encode undesirable social biases from the data on which they are trained. This can result, for example, in text classification models producing very different predictions for "I am a person with mental illness" and "I am a tall person". In this paper, we present evidence of such biases in existing ML models, and in the data used for model development. First, we demonstrate that a machine-learned model for moderating conversations classifies texts that mention disability as more "toxic". Similarly, a machine-learned sentiment analysis model rates texts that mention disability as more negative. Second, we demonstrate that neural text representation models, which are critical to many ML applications, can also contain undesirable biases towards mentions of disability. Third, we show that the data used to develop such models reflects topical biases in social discourse that may explain such biases in the models; for instance, gun violence, homelessness, and drug addiction are over-represented in discussions of mental illness.
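Biases in text representations can be probed with simple association tests. The sketch below is a generic illustration of such a test, not the paper's methodology: it checks whether embeddings of phrases mentioning disability sit closer to negative than to positive reference words. The `embed` function is hypothetical and stands in for any model that maps a string to a fixed-size vector.

```python
# Generic embedding association test (an illustration, not the paper's method):
# compare how close a phrase's embedding is to negative versus positive words.
# `embed` is a hypothetical function returning a fixed-size vector for a string.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(phrase, positive_words, negative_words, embed):
    p = embed(phrase)
    pos = np.mean([cosine(p, embed(w)) for w in positive_words])
    neg = np.mean([cosine(p, embed(w)) for w in negative_words])
    # Positive return values mean the phrase leans toward the negative words.
    return neg - pos

# Example usage with any embedding model exposing embed(text) -> np.ndarray:
# association("a person with mental illness",
#             ["good", "calm"], ["dangerous", "bad"], embed)
```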