Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.671

Contrastive Self-Supervised Learning for Commonsense Reasoning

Abstract: We propose a self-supervised method to solve Pronoun Disambiguation and Winograd Schema Challenge problems. Our approach exploits the characteristic structure of training corpora related to so-called "trigger" words, which are responsible for flipping the answer in pronoun disambiguation. We achieve such commonsense reasoning by constructing pairwise contrastive auxiliary predictions. To this end, we leverage a mutual exclusive loss regularized by a contrastive margin. Our architecture is based on the recently…
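To make the abstract's construction concrete, the following is a minimal PyTorch sketch of a pairwise mutual-exclusive loss regularized by a contrastive margin over Winograd "twin" sentences. Everything in it is an assumption made for illustration (the function name, the (batch, 2) logit layout, and the exact form of both terms); it is not the paper's implementation.

import torch

def mutual_exclusive_loss(logits_a: torch.Tensor,
                          logits_b: torch.Tensor,
                          margin: float = 0.2) -> torch.Tensor:
    # Hypothetical sketch: logits_a / logits_b hold scores for the two
    # candidate antecedents in each member of a twin pair (sentences that
    # differ only in the trigger word), shape (batch, 2).
    p_a = torch.softmax(logits_a, dim=-1)
    p_b = torch.softmax(logits_b, dim=-1)

    # Mutual exclusivity: the twins should resolve to different candidates,
    # so reward the "off-diagonal" joint assignments.
    exclusive = p_a[:, 0] * p_b[:, 1] + p_a[:, 1] * p_b[:, 0]
    l_mex = -torch.log(exclusive + 1e-8)

    # Contrastive margin: the preferred candidate should beat the other
    # by at least `margin` probability mass in each sentence.
    l_cm = torch.relu(margin - (p_a[:, 0] - p_a[:, 1]).abs()) \
         + torch.relu(margin - (p_b[:, 0] - p_b[:, 1]).abs())

    return (l_mex + l_cm).mean()

# Toy usage on random candidate scores for two twin pairs.
torch.manual_seed(0)
logits_a = torch.randn(2, 2, requires_grad=True)
logits_b = torch.randn(2, 2, requires_grad=True)
loss = mutual_exclusive_loss(logits_a, logits_b)
loss.backward()

Note that such a construction needs no answer labels, only the twin-pair structure of the corpus, which is consistent with the self-supervised framing in the abstract.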

Cited by 45 publications (37 citation statements); references 14 publications (16 reference statements). Citing publications range from 2020 to 2024.

Citation statements (ordered by relevance):
“…Annotating more shots in the target language is an intuitive solution. Designing task-specific pretraining/finetuning objectives could also be promising (Klein and Nabi, 2020; Ram et al., 2021).…”
Section: Target-adapting Results
confidence: 99%
“…Knowledge-Enriched BERT: Incorporating external knowledge into BERT has been shown to be effective. Such external knowledge includes world (factual) knowledge for tasks such as entity typing and relation classification (Zhang et al., 2019; Peters et al., 2019; Liu et al., 2019a; Xiong et al., 2019), sentiment knowledge for sentiment analysis (Tian et al., 2020), word sense knowledge for word sense disambiguation (Levine et al., 2019), commonsense knowledge for commonsense reasoning (Klein and Nabi, 2020) and sarcasm generation (Chakrabarty et al., 2020), legal knowledge for legal element extraction (Zhong et al., 2020), numerical skills for numerical reasoning (Geva et al., 2020), and coding knowledge for code generation. Biomedical BERT: BERT can also be enriched with biomedical knowledge via pre-training over biomedical corpora like PubMed, as in BioBERT (Lee et al., 2020), SciBERT (Beltagy et al., 2019), ClinicalBERT (Alsentzer et al., 2019) and BlueBERT (Peng et al., 2019).…”
Section: Related Work
confidence: 99%
“…Trinh and Le (2018) use a pre-trained language model to score candidate sentences for the Pronoun Disambiguation and Winograd Schema Challenge (Levesque et al., 2012). Klein and Nabi (2020) use a sentence-level loss to enhance commonsense knowledge in BERT. Mao et al. (2019) demonstrate that pre-trained language models fine-tuned on SWAG (Zellers et al., 2018) are able to provide commonsense grounding for story generation.…”
Section: Related Work
confidence: 99%
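As background for the candidate-scoring approach quoted above (Trinh and Le, 2018), here is a minimal sketch assuming a GPT-2 model from the Hugging Face transformers library stands in for their language-model ensemble: each candidate antecedent is substituted for the pronoun, and the substitution the model assigns the higher total log-probability wins. The schema text and the helper sentence_log_prob are illustrative, not taken from the cited work.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    # Total log-probability of the sentence under the LM.
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels == input_ids the model returns the mean token
        # cross-entropy over the predicted positions.
        mean_nll = model(ids, labels=ids).loss.item()
    n_predicted = ids.size(1) - 1  # the first token has no prediction target
    return -mean_nll * n_predicted

schema = "The trophy doesn't fit into the suitcase because {} is too large."
candidates = ["the trophy", "the suitcase"]
scores = {c: sentence_log_prob(schema.format(c)) for c in candidates}
print(max(scores, key=scores.get))  # the correct Winograd answer is "the trophy"

Trinh and Le (2018) additionally report refinements such as partial scoring and ensembling multiple language models; the sketch keeps only the core substitute-and-score idea.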