Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing 2019
DOI: 10.18653/v1/d19-6002

A Hybrid Neural Network Model for Commonsense Reasoning

Abstract: This paper proposes a hybrid neural network (HNN) model for commonsense reasoning. An HNN consists of two component models, a masked language model and a semantic similarity model, which share a BERT-based contextual encoder but use different model-specific input and output layers. HNN obtains new state-of-the-art results on three classic commonsense reasoning tasks, pushing the WNLI benchmark to 89%, the Winograd Schema Challenge (WSC) benchmark to 75.1%, and the PDP60 benchmark to 90.0%. An ablation study sho…
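
A minimal, illustrative PyTorch sketch of the two-head design the abstract describes: a masked-LM head and a semantic-similarity head sharing one BERT encoder. The class name, head shapes, and [CLS] pooling below are assumptions made for illustration, not the authors' released implementation.

```python
import torch.nn as nn
from transformers import BertModel

class HybridCommonsenseModel(nn.Module):
    """Hypothetical sketch: two task-specific heads sharing one BERT encoder."""

    def __init__(self, model_name: str = "bert-large-uncased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)  # shared contextual encoder
        hidden = self.encoder.config.hidden_size
        vocab = self.encoder.config.vocab_size
        # Masked-LM head: scores vocabulary tokens at masked pronoun positions.
        self.mlm_head = nn.Linear(hidden, vocab)
        # Semantic-similarity head: scores a (context, candidate) pair from [CLS].
        self.sim_head = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask, mask_positions):
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state  # (B, T, H)
        # Gather the hidden states at the masked positions for the LM head.
        idx = mask_positions.unsqueeze(-1).expand(-1, -1, states.size(-1))      # (B, M, H)
        mlm_logits = self.mlm_head(states.gather(1, idx))                       # (B, M, V)
        sim_logit = self.sim_head(states[:, 0])                                 # (B, 1)
        return mlm_logits, sim_logit
```

At inference, each head scores the answer candidates, and the two scores can be combined (for example, averaged) to rank them; the exact combination rule is not recoverable from the truncated abstract.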

Cited by 16 publications (12 citation statements: 0 supporting, 12 mentioning, 0 contrasting) · References 27 publications · Citing publications: 2020–2025

Citation statements (ordered by relevance):
“…Opitz and Frank (2018) is the first work to propose transfer learning from another pronoun resolution dataset, such as DPR, to WSC. He et al. (2019) use a hybrid model of Wang et al. (2019b) and Kocijan et al. (2019b). Ruan et al. (2019) explore BERT's next sentence prediction with fine-tuning on DPR.…”
Section: Discussion (mentioning)
confidence: 99%
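
As a loose illustration of the next-sentence-prediction approach attributed to Ruan et al. (2019) above, one can score each candidate-substituted clause as a continuation of the context using BERT's NSP head and keep the higher-scoring candidate. The pairing scheme below is an assumption for illustration, not their exact formulation.

```python
import torch
from transformers import BertForNextSentencePrediction, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")
model.eval()

# Classic trophy/suitcase schema: the context plus each
# candidate-substituted continuation.
context = "The trophy doesn't fit into the brown suitcase."
candidates = ["The trophy is too large.", "The suitcase is too large."]

scores = []
for cand in candidates:
    enc = tokenizer(context, cand, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits      # shape (1, 2): [is_next, not_next]
    scores.append(logits[0, 0].item())    # higher "is_next" logit = better continuation

print(candidates[scores.index(max(scores))])  # intended answer: the "trophy" reading
```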
“…Traditional attempts at commonsense reasoning usually involve heavy use of annotated knowledge bases (KBs), rule-based reasoning, or hand-crafted features (Bailey et al., 2015; Schüller, 2014; Sharma et al., 2015). Only very recently, after the success of natural language representation learning, have several works proposed to use supervised learning to discover commonsense relationships, achieving state-of-the-art results on multiple benchmarks (see, e.g., Kocijan et al., 2019; He et al., 2019; Ye et al., 2019; Ruan et al., 2019). As an example, Kocijan et al. (2019) proposed to exploit the labels for commonsense reasoning directly and showed that the performance of multiple language models on Winograd consistently and robustly improves when they are fine-tuned on a similar pronoun disambiguation problem dataset.…”
Section: Previous Work (mentioning)
confidence: 99%
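
The fine-tuning trick credited to Kocijan et al. (2019) in the statement above can be sketched as masked-LM training that pushes the model to prefer the correct candidate at a masked pronoun. The example sentence, single-token candidates, and margin objective below are illustrative assumptions, not their published loss.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical DPR-style training instance with single-token candidates.
sentence = "The wolves chased the sheep because [MASK] were hungry."
correct, wrong = "wolves", "sheep"

enc = tokenizer(sentence, return_tensors="pt")
mask_pos = (enc.input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
log_probs = model(**enc).logits[0, mask_pos].log_softmax(-1).squeeze(0)

lp_correct = log_probs[tokenizer.convert_tokens_to_ids(correct)]
lp_wrong = log_probs[tokenizer.convert_tokens_to_ids(wrong)]

# Margin-style objective (an assumption): keep the correct candidate likely
# while separating it from the wrong one by a margin.
loss = -lp_correct + torch.clamp(lp_wrong - lp_correct + 0.5, min=0)
loss.backward()  # an optimizer step would follow in a real training loop
```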
“…Recently, the research community has seen an abundance of methods proposing to utilize the latest word embedding and language model (LM) technologies for commonsense reasoning (Kocijan et al., 2019; He et al., 2019; Ye et al., 2019; Ruan et al., 2019; Trinh and Le, 2018; Klein and Nabi, 2019). The underlying assumption of these methods is that, since such models are learned on large text corpora (such as Wikipedia), they implicitly capture commonsense knowledge to a certain degree.…”
Section: Introduction (mentioning)
confidence: 99%
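
The LM-scoring recipe these works build on (introduced by Trinh and Le, 2018) can be sketched as follows: substitute each answer candidate for the ambiguous pronoun and keep the substitution to which the language model assigns the higher total log-probability. The GPT-2 checkpoint and schema instance below are stand-ins for illustration.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Total log-probability of a sentence under the LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids the model returns the mean cross-entropy
        # over the ids.size(1) - 1 predicted tokens; undo the averaging.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.size(1) - 1)

# Classic schema: substitute each candidate for the pronoun and
# keep the substitution the LM finds more probable.
template = ("The city councilmen refused the demonstrators a permit "
            "because {} feared violence.")
candidates = ["the councilmen", "the demonstrators"]
best = max(candidates, key=lambda c: sentence_log_prob(template.format(c)))
print(best)  # the schema's intended answer is "the councilmen"
```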
“…Commonsense Reasoning in NLP: In addition to commonsense datasets, commonsense knowledge has recently been explored in different NLP tasks. To name a few, Trinh and Le (2018), He et al. (2019), and Klein and Nabi (2019) use language models trained on huge text corpora to perform inference on the WSC dataset. Ding et al. (2019) use commonsense knowledge from Atomic (Sap et al., 2019a) and Event2Mind (Rashkin et al., 2018) for downstream tasks such as script event prediction.…”
Section: Commonsense Datasets (mentioning)
confidence: 99%