2019
DOI: 10.1101/547406
Preprint

Generalizing to generalize: when (and when not) to be compositional in task structure learning

Abstract: Humans routinely face novel environments in which they have to generalize in order to act adaptively. However, doing so involves the non-trivial challenge of deciding which aspects of a task domain to generalize. While it is sometimes appropriate to simply re-use a learned behavior, often adaptive generalization entails recombining distinct components of knowledge acquired across multiple contexts. Theoretical work has suggested a computational trade-off in which it can be more or less useful to learn and gener…

Cited by 5 publications (6 citation statements)
References 47 publications
“…When a musician picks up a banjo, they may quickly recognize its similarity to other string instruments—even those with alternate tuning—and efficiently learn to play a scale; the same musician may re-use a different structure when attempting to master the accordion. Previous theoretical work relied on non-parametric Bayesian clustering models that assess which of several previously seen structures might apply to a novel situation and be flexibly combined in a compositional fashion [3], a strategy supported by empirical studies in humans [16]. However, such an approach still requires the agent to recognize that the specific transition function and/or the reward function is portable to new situations.…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…While the presented reward-predictive model transfers state abstractions across tasks, this model has to re-learn how individual latent states are associated with one-step rewards or SFs for each task. In fact, the presented abstraction transfer models could be combined with prior work [3, 13, 16, 31] that transfers SFs, latent transition functions, or latent reward functions to integrate the benefits of each transfer system.…”
Section: Discussionmentioning
confidence: 99%
“…Our work motivates the development of targeted experimental designs that would test if human subjects can reuse a latent structure that is present in a set of tasks despite variations in transitions and rewards. For example, one could design a human subject study similar to [16] where participants solve a sequence of grid-world navigation problems, but augment the design to test if subjects reuse a latent structure present in a set of tasks despite variations in transitions and rewards, similar to the task sequence presented in vary depending on whether agents use reward-predictive state abstractions or re-use SR abstractions. Thus, our work provides a concrete testable behavioral prediction that would discriminate between our work and existing work.…”
Section: Discussionmentioning
confidence: 99%