Social IQa: Commonsense Reasoning about Social Interactions

Sap, Maarten; Rashkin, Hannah; Chen, Derek; Le Bras, Ronan; Choi, Yejin

doi:10.18653/v1/d19-1454

Cited by 290 publications

(308 citation statements)

References 29 publications

Supporting

Mentioning

301

Contrasting


“…The model generates the correct stereotypes when there is high lexical overlap with the post (e.g., examples d and e). This is in line with previous research showing that large language models rely on correlational patterns in data (Sap et al, 2019c; Sakaguchi et al, 2020).…”

Section: Classification Shown In (supporting)

confidence: 93%

Social Bias Frames: Reasoning about Social and Power Implications of Language

Sap, Gabriel, Qin, et al. 2020

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Self Cite

197

190


Warning: this paper contains content that may be offensive or upsetting. We then establish baseline approaches that learn to recover SOCIAL BIAS FRAMES from unstructured text. We find that while state-of-the-art neural models are effective at high-level categorization of whether a given statement projects unwanted social bias (80% F1), they are not effective at spelling out more detailed explanations in terms of SOCIAL BIAS FRAMES. Our study motivates future work that combines structured pragmatic inference with commonsense reasoning on social implications.



“…Based on the protocols we introduce, we show that performing at state-of-the-art on these datasets does not necessarily imply strong common-sense reasoning capability. We are happy to see a rising interest in the WSC in the community, including very recent work by Ruan et al (2019) and Sap et al (2019), which reinforces the need for proper evaluation protocols. With the release of an increasing number of fine-grained inference tasks aimed at these abilities (Roemmele et al, 2011; Morgenstern et al, 2016; Wang et al, 2018; Rashkin et al, 2018; McCann et al, 2018), the issue of experimental validity in CSR will also become even more important.…”

Section: Results (mentioning)

confidence: 80%

How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG

Trichelair, Emami, Trischler, et al. 2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen


Recent studies have significantly improved the state-of-the-art on common-sense reasoning (CSR) benchmarks like the Winograd Schema Challenge (WSC) and SWAG. The question we ask in this paper is whether improved performance on these benchmarks represents genuine progress towards common-sense-enabled systems. We make case studies of both benchmarks and design protocols that clarify and qualify the results of previous work by analyzing threats to the validity of previous experimental designs. Our protocols account for several properties prevalent in common-sense benchmarks including size limitations, structural regularities, and variable instance difficulty.


“…But despite these impressive performance improvements in a variety of NLP tasks, it remains unclear whether these models are performing complex reasoning, or if they are merely learning complex surface correlation patterns (Davis and Marcus, 2015; Marcus, 2018). This difficulty in measuring the progress in commonsense reasoning using downstream tasks has yielded increased efforts at developing robust benchmarks for directly measuring commonsense capabilities in multiple settings, such as social interactions (Sap et al, 2019b; Rashkin et al, 2018a) and physical situations (Zellers et al, 2019; Talmor et al, 2019).…”

Section: Type Of the Tutorial (mentioning)

confidence: 99%

“…In response, recent work has focused on using crowdsourcing and automatic filtering to design large-scale benchmarks while maintaining negative examples that are adversarial to machines (Zellers et al, 2018). We will review recent benchmarks that have emerged to assess whether machines have acquired physical (e.g., Talmor et al, 2019; Zellers et al, 2019), social (e.g., Sap et al, 2019b), or temporal commonsense reasoning capabilities (e.g., , as well as benchmarks that combine commonsense abilities with other tasks (e.g., reading comprehension; Ostermann et al, 2018;.…”

Section: Description (mentioning)

confidence: 99%


Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

2020


While deep learning has transformed the natural language processing (NLP) field and impacted the larger computational linguistics community, the rise of neural networks is stained by their opaque nature: It is challenging to interpret the inner workings of neural network models, and explicate their behavior. Therefore, in the last few years, an increasingly large body of work has been devoted to the analysis and interpretation of neural network models in NLP. This body of work is so far lacking a common framework and methodology. Moreover, approaching the analysis of modern neural networks can be difficult for newcomers to the field. This tutorial aims to fill this gap and introduce the nascent field of interpretability and analysis of neural networks in NLP. This tutorial will cover cutting-edge research in interpretability and analysis of modern neural NLP models. The topic has not been previously covered in *CL tutorials.

