2022
DOI: 10.1609/aaai.v36i11.21584

On Semantic Cognition, Inductive Generalization, and Language Models

Abstract: My doctoral research focuses on understanding semantic knowledge in neural network models trained solely to predict natural language (referred to as language models, or LMs), by drawing on insights from the study of concepts and categories grounded in cognitive science. I propose a framework inspired by 'inductive reasoning,' a phenomenon that sheds light on how humans utilize background knowledge to make inductive leaps and generalize from new pieces of information about concepts and their properties. Drawing…

Cited by 2 publications (2 citation statements)
References 6 publications
“…Ultimately, such language representation models would be used in the real world, not only in multiple‐choice settings, but also in so‐called generative settings where the model may be expected to generate answers to questions (without being given options). Even in the multiple‐choice setting, without robust commonsense, the model will likely not be usable for actual decision making unless we can trust that it is capable of generalization (Kejriwal, 2021; Misra, 2022; Wahle et al, 2022). One option to implementing such robustness in practice may be to add a ‘decision‐making layer’ on a pre‐trained language representation model rather than aim to modify the model's architecture from scratch (Hong et al, 2021; Tang & Kejriwal, 2022; Zaib et al, 2020).…”
Section: Discussion (mentioning, confidence: 99%)
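The ‘decision‐making layer’ idea in the statement above can be pictured as a small trainable head stacked on a frozen pre-trained encoder. The following is only a minimal sketch of that general pattern, not the cited authors' implementation; the Hugging Face transformers API usage, the bert-base-uncased checkpoint, the head sizes, and the four-choice setup are all illustrative assumptions.

```python
# Minimal sketch: a trainable decision head on top of a frozen pre-trained
# encoder (illustrative only; model name and dimensions are placeholders).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class DecisionHead(nn.Module):
    """Small trainable classifier stacked on a frozen language encoder."""
    def __init__(self, encoder_name: str = "bert-base-uncased", num_choices: int = 4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        for p in self.encoder.parameters():   # freeze the pre-trained model
            p.requires_grad = False
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Sequential(
            nn.Linear(hidden, 256), nn.ReLU(), nn.Linear(256, num_choices)
        )

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]      # first-token ([CLS]-style) representation
        return self.classifier(cls)            # scores over the answer choices

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = DecisionHead()
batch = tokenizer(["A robin can fly."], return_tensors="pt", padding=True)
scores = model(batch["input_ids"], batch["attention_mask"])  # shape: (1, num_choices)
```

Only the classifier's parameters receive gradients here, which is the point of the quoted suggestion: the decision layer is trained for the downstream task while the pre-trained model's architecture is left untouched.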
“…While some work has found that information can be recovered from BERT's token representation [67], the model still has trouble 'understanding' concepts that are relatively natural to humans, such as negation and basic numeracy [25]. Like many other machine learning models, the model can also be overly confident in some of its inputs, and is susceptible to problems of both generalization and adversarial attacks [34, 43, 68-70]. Furthermore, several experiments have demonstrated that, although BERT effectively encodes information about relations, entity types, relations, semantic roles, as well as proto-roles, it can lose some of its robustness in the face of basic named entity replacements [26].…”
Section: Understanding the Properties Of Language Representation Mode... (mentioning, confidence: 99%)
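The robustness concern in the last statement (sensitivity to basic named-entity replacements) can be probed with a very simple perturbation test. The sketch below is a hypothetical illustration, not an experiment from the cited works; the transformers masked-LM API, the bert-base-uncased checkpoint, and the capital-city prompts are assumptions chosen only to make the idea concrete.

```python
# Minimal sketch of a named-entity-replacement probe for a masked LM
# (illustrative only; prompts and model are placeholders).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def top_fill(template: str) -> str:
    """Return the model's top prediction for the [MASK] slot in `template`."""
    inputs = tokenizer(template, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = model(**inputs).logits
    return tokenizer.decode([logits[0, mask_pos].argmax().item()])

# Same factual frame, only the named entity differs; a robust model should
# behave consistently across the two variants.
print(top_fill("Paris is the capital of [MASK]."))
print(top_fill("Ottawa is the capital of [MASK]."))
```

Comparing predictions across such minimally different prompts is one lightweight way to surface the kind of brittleness the quoted survey attributes to entity replacements.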