2021
DOI: 10.48550/arxiv.2109.13006
Preprint

RuleBert: Teaching Soft Rules to Pre-trained Language Models

Abstract: While pre-trained language models (PLMs) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge. In fact, even if information is available in the form of approximate (soft) logical rules, it is not clear how to transfer it to a PLM in order to improve its performance for deductive reasoning tasks. Here, we aim to bridge this gap by teaching PLMs how to reason with soft Horn rules. We introduce a class…
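As a concrete illustration of the setup the abstract describes, the sketch below fine-tunes a generic BERT-style classifier on one textual soft-rule example, training against a soft label equal to the rule's confidence. It is a minimal sketch only: the model name, rule text, and 0.8 confidence value are placeholder assumptions, not the authors' released code or data.

```python
# Minimal sketch (not the authors' code): fine-tune a BERT-style classifier on a
# textual soft-rule reasoning example. The context concatenates rules and facts,
# the hypothesis is the candidate conclusion, and the target is a soft label
# reflecting the rule's confidence.
import torch
from torch.nn import functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=1)

# Hypothetical example: a soft rule with confidence 0.8 becomes a soft target.
context = (
    "If someone is the parent of X and X is the parent of Y, then they are likely "
    "the grandparent of Y. Alice is the parent of Bob. Bob is the parent of Carol."
)
hypothesis = "Alice is the grandparent of Carol."
soft_label = torch.tensor([[0.8]])

inputs = tokenizer(context, hypothesis, return_tensors="pt", truncation=True)
logits = model(**inputs).logits                      # shape: (1, 1)
loss = F.binary_cross_entropy_with_logits(logits, soft_label)
loss.backward()                                      # one illustrative update step
```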

Cited by 3 publications (3 citation statements)

References 27 publications
“…Embedding-based methods first convert symbolic facts and rules to embeddings and then apply neural network layers on top to softly predict answers. Recent work in deductive reasoning focused on tasks where rules and facts are expressed in natural language (Talmor et al., 2020; Saeed et al., 2021; Clark et al., 2020b; Kassner et al., 2020). Such tasks are more challenging because the model has to first understand the logic described in the natural language sentences before performing logical reasoning.…”
Section: Related Work (mentioning, confidence: 99%)
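For readers unfamiliar with the "embedding-based methods" contrasted in the statement above, the following is a minimal, hypothetical sketch of that pattern: symbols are mapped to learned vectors and a small network scores a query softly. The vocabulary size, mean pooling, and scoring head are illustrative assumptions, not the design of any cited system.

```python
# Minimal sketch, not from any cited paper: an embedding-based reasoner that maps
# symbolic facts/rules to vectors and scores a query with a small network,
# producing a soft (probabilistic) answer rather than a discrete proof.
import torch
import torch.nn as nn

class EmbeddingReasoner(nn.Module):
    def __init__(self, n_symbols: int, dim: int = 64):
        super().__init__()
        self.emb = nn.Embedding(n_symbols, dim)        # one vector per predicate/entity symbol
        self.score = nn.Sequential(                    # soft scoring head over pooled context + query
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def forward(self, context_ids: torch.Tensor, query_ids: torch.Tensor) -> torch.Tensor:
        ctx = self.emb(context_ids).mean(dim=1)        # pool embedded facts and rules
        qry = self.emb(query_ids).mean(dim=1)          # pool the embedded query triple
        return torch.sigmoid(self.score(torch.cat([ctx, qry], dim=-1)))  # soft truth value in [0, 1]

# Hypothetical symbol ids; a real system grounds these from a knowledge base.
reasoner = EmbeddingReasoner(n_symbols=100)
p = reasoner(torch.tensor([[3, 7, 12, 7, 9, 21]]), torch.tensor([[3, 9, 21]]))
print(p.item())  # probability that the queried fact follows from the context
```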
“…Chain of thought sequence modeling. The idea of decomposing multi-step problems into intermediate steps (the so-called chain of thought [58]) and learning the intermediate steps using a sequence model has been applied to domain-specific problems such as program induction [59], learning to solve math problems [60], learning to execute [61], learning to reason [62,63,64,65,66,67], and language model prompting [58]. The chain of thought imitation learning problem we formulate is domain-agnostic and applicable to many sequential decision-making tasks traditionally solved by imitation learning in a Markovian setting, such as robot locomotion, navigation, manipulation, and strategy games.…”
Section: Related Work (mentioning, confidence: 99%)
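The chain-of-thought framing quoted above reduces to ordinary sequence modeling over intermediate steps followed by the final answer. The sketch below shows that reduction with GPT-2 standing in for any causal language model; the arithmetic example and step formatting are placeholders, not taken from the cited work.

```python
# Minimal sketch, under the stated framing only: train a sequence model to emit
# intermediate steps (the "chain of thought") before the final answer, rather
# than mapping inputs directly to answers.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical training example: the target sequence spells out each reasoning step.
text = (
    "Q: Alice has 3 apples and buys 2 bags of 4 apples. How many apples does she have?\n"
    "Step 1: 2 bags of 4 apples is 2 * 4 = 8 apples.\n"
    "Step 2: 3 + 8 = 11.\n"
    "A: 11"
)
batch = tokenizer(text, return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss  # next-token loss over steps + answer
loss.backward()                                        # one illustrative update step
```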
“…For example, there is work that uses discrete parses to template neural network components (Arabshahi et al., 2018; Mao et al., 2019; Yi et al., 2018). There is also work that seeks to embed symbolic knowledge into network parameters via special loss functions (Xu et al., 2018; Seo et al., 2021) or carefully curated datasets (Lample and Charton, 2019; Clark et al., 2020; Saeed et al., 2021) and architectures (?). Other related work seeks to incorporate logical constraints into text generation models.…”
Section: Related Work (mentioning, confidence: 99%)
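To make the "special loss functions" idea in the last statement concrete, here is a minimal sketch of a rule-violation penalty added to a model's training loss. It uses a simple hinge-style relaxation of the implication "rain implies wet ground"; this is an illustrative assumption, not the semantic loss of Xu et al. (2018) or any other cited method.

```python
# Minimal sketch of the general idea only: add a penalty that is nonzero whenever
# the network's predicted probabilities violate a known rule, here the implication
# rain -> wet_ground, via a simple fuzzy/hinge relaxation.
import torch

def implication_penalty(p_rain: torch.Tensor, p_wet: torch.Tensor) -> torch.Tensor:
    # Under "rain implies wet ground", p_wet should be at least p_rain;
    # the hinge term penalizes the gap whenever it is not.
    return torch.relu(p_rain - p_wet).mean()

# Stand-in model outputs; in practice these come from the network being trained.
p_rain = torch.sigmoid(torch.randn(8, requires_grad=True))
p_wet = torch.sigmoid(torch.randn(8, requires_grad=True))
penalty = implication_penalty(p_rain, p_wet)   # added to the usual task loss
penalty.backward()
```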