2022
DOI: 10.48550/arxiv.2205.11502
Preprint
On the Paradox of Learning to Reason from Data

Abstract: Logical reasoning is needed in a wide range of NLP tasks. Can a BERT model be trained end-to-end to solve logical reasoning problems presented in natural language? We attempt to answer this question in a confined problem space where there exists a set of parameters that perfectly simulates logical reasoning. We make observations that seem to contradict each other: BERT attains near-perfect accuracy on in-distribution test examples while failing to generalize to other data distributions over the exact same prob…

Cited by 11 publications (17 citation statements)
References 14 publications
“…We also observe ILLUME to yield no satisfying results on the CLEVR-X dataset. Details in this regard can be found in Appendix G. In summary, we attribute this behavior to the same observations made by Zhang et al (2022), in that current LMs appear incapable of inferring logical reasoning from a few training examples. Therefore, VLMs bootstrapped from LMs struggle to transfer logical reasoning capabilities without major extensions.…”
Section: (Bottom) | supporting
confidence: 68%
“…Flaws in Logical Reasoning One frequently observed shortcoming of large neural networks is their inability to generalize to logical reasoning. Zhang et al (2022) recently demonstrated that BERT does not learn logical reasoning but instead captures statistical features in the training data. Therefore, the model remains unable to generalize to other distributions of the exact same problem.…”
Section: (Bottom) | mentioning
confidence: 99%
“…Even though the results (in Table 2) showcase an uptick in the number of successful plans, the overall performance is still around 20%. This is unsurprising as [36] point out that language models tend to focus on the inherent statistical features in reasoning problems which affects their performance on such tasks.…”
Section: Appendix A.1 Additional Experiments | mentioning
confidence: 99%
“…After all, modern machine learning algorithms are nothing other than statistical models based on a huge set of matrix multiplications, and there are publications highlighting the inherent limits of such systems. For example, it lies in the nature of their architecture that AI models will never be able to do things that are quite simple for humans, such as making logical inferences or entertaining abstract reasoning, because all they do (and this is what they are really good at) is making extrapolations using statistical operations [29,137,138]. However, such claims may be debatable since they strongly depend on what we mean when making reference to terms like "abstract reasoning" or also just "reasoning" per se [26,49].…”
Section: Implications for the (Digital) Humanities | mentioning
confidence: 99%
“…The most groundbreaking model in April was Flamingo, produced by Google's DeepMind, because it combined a 70B-parameter language model with an image model [3]. In May, Meta AI released OPT-175B with open-sourced code [137,138]. And just in the past few days at the time of this writing, Google AI published LaMDA 2 [117], DeepMind released an intriguing model called Gato with 1.18B parameters [100], and Google Research published UL20B [113]. At the time of this publication, the last few weeks showed groundbreaking innovations in multimodality, combining natural language processing (NLP) with computer vision; the most prominent models are DALL-E 2 by OpenAI [105], Flamingo by DeepMind [3], and, a few days ago, Imagen, introduced by Google [105].…”
mentioning
confidence: 99%