2023
DOI: 10.1101/2023.04.04.535512
Preprint

Emergence of belief-like representations through reinforcement learning

Abstract: To behave adaptively, animals must learn to predict future reward, or value. To do this, animals are thought to learn reward predictions using reinforcement learning. However, in contrast to classical models, animals must learn to estimate value using only incomplete state information. Previous work suggests that animals estimate value in partially observable tasks by first forming "beliefs"---optimal Bayesian estimates of the hidden states in the task. Although this is one way to solve the problem of partial …
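The belief-based approach the abstract describes can be illustrated with a minimal sketch: TD(0) value learning where the feature vector is a Bayesian belief over hidden states. This is not the paper's model; the two-state task, transition matrix, observation likelihoods, and learning parameters below are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch (illustrative assumptions, not the paper's model):
# TD(0) value learning over Bayesian belief states in a 2-state
# hidden Markov task with noisy observations.

rng = np.random.default_rng(0)

T = np.array([[0.9, 0.1],    # T[i, j] = P(next hidden state j | current i)
              [0.1, 0.9]])
O = np.array([[0.8, 0.2],    # O[o, s] = P(observation o | hidden state s)
              [0.2, 0.8]])
R = np.array([0.0, 1.0])     # reward delivered in each hidden state

def update_belief(b, obs):
    """One step of Bayesian filtering: predict, then condition on obs."""
    predicted = T.T @ b              # prior over next hidden state
    posterior = predicted * O[obs]   # weight by observation likelihood
    return posterior / posterior.sum()

# Linear value function over the belief vector, learned with TD(0)
w = np.zeros(2)
alpha, gamma = 0.1, 0.9
b = np.array([0.5, 0.5])             # uniform initial belief
s = 0                                 # true hidden state (unobserved)
for _ in range(5000):
    s_next = rng.choice(2, p=T[s])
    obs = rng.choice(2, p=O[:, s_next])
    r = R[s_next]
    b_next = update_belief(b, obs)
    delta = r + gamma * w @ b_next - w @ b   # TD error (RPE-like signal)
    w += alpha * delta * b
    b, s = b_next, s_next

print(w)  # learned values associated with the two hidden states
```

The agent never observes the true state; it learns values over its filtered belief, which is the "belief state" construction the abstract refers to.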

Cited by 4 publications (2 citation statements)
References 34 publications
“…A teaching signal is only as good as the information it receives about the world. This is increasingly being recognized by the incorporation of channels, features, and bases into temporal difference learning models (Lee et al, 2022;Millidge et al, 2023;Takahashi et al, 2023) and in research positing that dopamine neurons are influenced by internal information, inference, and dynamically evolving beliefs (Hennig et al, 2023;Lak et al, 2017;Nomoto et al, 2010;Papageorgiou et al, 2016;Sadacca et al, 2016;Starkweather et al, 2017;Starkweather et al, 2018;Wassum et al, 2011). Yet these findings are likely only the tip of the iceberg; the results here and in the related studies show experimentally that information from even high-level association cortices is utilized by these neurons.…”
Section: Discussion
confidence: 64%
“…Hidden state estimation refers to the process by which an animal or agent infers an unobservable state, such as the probability of receiving a reward under a certain set of conditions, in order to choose an optimal behavioral strategy [161]. For example, in our study involving mice, reward-relative sequences could be instrumental in helping the animals discern different reward states, such as whether they are in a state of the task associated with reward A, B, or C. If the hippocampus is able to build a code during learning that allows it to accurately infer such states, then this inference process can generalize to similar future scenarios, allowing the animal to learn which features of an experience predict reward and use these predictions in the future [161,162]. Interestingly, however, the density of the reward-relative sequences we observed was highest following the beginning of the zone where animals were most likely to get reward, rather than preceding it.…”
Section: Discussion
confidence: 99%
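The hidden state estimation described in the citation above amounts to Bayesian inference over discrete reward-state hypotheses. A minimal sketch, under assumed reward probabilities for three hypothetical states A, B, and C (the states and probabilities are invented for illustration):

```python
import numpy as np

# Hedged sketch: infer which of three hypothetical reward states
# (A, B, or C) the agent is in, from a sequence of binary reward
# outcomes. Reward probabilities are illustrative assumptions.
p_reward = {"A": 0.8, "B": 0.5, "C": 0.2}   # assumed P(reward | state)
states = list(p_reward)

def infer_state(outcomes):
    """Posterior over states given 0/1 reward outcomes,
    starting from a uniform prior (computed in log space)."""
    log_post = np.zeros(len(states))
    for r in outcomes:
        for i, s in enumerate(states):
            p = p_reward[s]
            log_post[i] += np.log(p if r else 1.0 - p)
    post = np.exp(log_post - log_post.max())   # normalize stably
    return post / post.sum()

post = infer_state([1, 1, 0, 1, 1])  # mostly rewarded outcomes
print(dict(zip(states, post)))
```

A run of mostly rewarded trials shifts the posterior toward the high-reward state, which is the kind of inference that would let reward-relative codes generalize across similar scenarios.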