2023
DOI: 10.31234/osf.io/ymve5
Preprint

Zero-shot compositional reasoning in a reinforcement learning setting

Abstract: People can easily evoke previously learned concepts, compose them, and apply the result to solve novel tasks on the first attempt. The aim of this paper is to improve our understanding of how people make such zero-shot compositional inferences in a reinforcement learning setting. To achieve this, we introduce an experimental paradigm where people learn two latent reward functions and need to compose them correctly to solve a novel task. We find that people have the capability to engage in zero-shot composition…

Cited by 3 publications
(1 citation statement)
References 55 publications
“…Furthermore, they used their knowledge to generalize to feature values unobserved during learning. Jagadish et al (2023) pushed this a step further; they showed that people are able to generalize to a composition of functions (i.e., adding a periodic function to a linear function) mapping response keys to rewards with remarkable accuracy on the first trial with practice only on the individual functions but without practice on the composite function.…”
Section: Evidence In Favor Of the Generalization Artist
Citation type: mentioning
confidence: 99%
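
The kind of composition described in the citation statement above (a periodic function added to a linear function, mapping response keys to rewards) can be illustrated with a minimal sketch. The number of keys, the slope, the amplitude, and the period below are hypothetical values chosen only for illustration; they are not taken from the preprint, and the function names are assumptions rather than the authors' code.

```python
# Illustrative sketch only: all constants and names are hypothetical,
# not the actual reward functions used in the experiment.

N_KEYS = 6  # assumed number of response keys (0..5)


def linear_reward(key: int, slope: float = 1.0) -> float:
    """Reward grows linearly with the key index."""
    return slope * key


def periodic_reward(key: int, amplitude: float = 3.0, period: int = 2) -> float:
    """Reward of `amplitude` on every `period`-th key, 0 elsewhere."""
    return amplitude if key % period == 0 else 0.0


def composite_reward(key: int) -> float:
    """Additive composition of the two component functions."""
    return linear_reward(key) + periodic_reward(key)


def best_key(reward_fn, n_keys: int = N_KEYS) -> int:
    """Key with the highest reward (first one on ties)."""
    return max(range(n_keys), key=reward_fn)


if __name__ == "__main__":
    for name, fn in [("linear", linear_reward),
                     ("periodic", periodic_reward),
                     ("composite", composite_reward)]:
        print(name, [fn(k) for k in range(N_KEYS)], "best key:", best_key(fn))
```

Under these assumed values, the best key for the composite function differs from the best key for either component alone, so a correct first-trial choice requires combining both learned functions rather than reusing one of them.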