Rare Tokens Degenerate All Tokens: Improving Neural Text Generation via Adaptive Gradient Gating for Rare Token Embeddings

Yu, Sangwon; Song, Jongyoon; Kim, Heeseung; Lee, Seong-min; Ryu, Woo-Jong; Yoon, Sungroh

doi:10.18653/v1/2022.acl-long.3

Cited by 6 publications

(1 citation statement)

References 21 publications

“…Lastly, we were only able to evaluate a limited number of word embedding algorithms that account for token frequency issues. Potential alternatives include KAFE (Ashfaq et al, 2022), which relies on a knowledge graph to improve token representations, and AGG (Yu et al, 2022), for which the code was not available at the time of conducting the experiments. Similarly, we chose to fine-tune our BERT model for four epochs in all cases to obtain a comparable setting.…”

Section: Limitations (mentioning)

confidence: 99%

No Word Embedding Model Is Perfect: Evaluating the Representation Accuracy for Social Bias in the Media

Spliethöver, Maximilian, Wachsmuth

2022

Preprint

News articles both shape and reflect public opinion across the political spectrum. Analyzing them for social bias can thus provide valuable insights, such as prevailing stereotypes in society and the media, which are often adopted by NLP models trained on respective data. Recent work has relied on word embedding bias measures, such as WEAT. However, several representation issues of embeddings can harm the measures' accuracy, including low-resource settings and token frequency differences. In this work, we study what kind of embedding algorithm serves best to accurately measure types of social bias known to exist in US online news articles. To cover the whole spectrum of political bias in the US, we collect 500k articles and review psychology literature with respect to expected social bias. We then quantify social bias using WEAT along with embedding algorithms that account for the aforementioned issues. We compare how models trained with the algorithms on news articles represent the expected social bias. Our results suggest that the standard way to quantify bias does not align well with knowledge from psychology. While the proposed algorithms reduce the gap, they still do not fully match the literature.
