2023
DOI: 10.1101/2023.05.09.540067
Preprint

Reward-Bases: Dopaminergic Mechanisms for Adaptive Acquisition of Multiple Reward Types

Abstract: Animals can adapt their preferences for different types of reward according to physiological state, such as hunger or thirst. To describe this ability, we propose a simple extension of the temporal difference model that learns multiple values of each state according to different reward dimensions, such as food or water. By weighting these learned values according to the current needs, behaviour may be flexibly adapted to present demands. Our model predicts that different dopamine neurons should be selective for di…
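
A minimal Python sketch of the mechanism the abstract describes: one TD value function per reward dimension, recombined at read-out according to current physiological needs. This is a hypothetical illustration under those assumptions, not the authors' code; names such as `td_update` and `needs` are invented for the example.

```python
import numpy as np

# One tabular value function per reward basis (e.g. 0 = food, 1 = water),
# each trained with a standard TD(0) update from its own reward channel.
n_states, n_bases = 10, 2
alpha, gamma = 0.1, 0.9            # learning rate, discount factor
V = np.zeros((n_bases, n_states))

def td_update(s, s_next, rewards):
    """One TD(0) step; rewards[i] is the reward delivered on basis i."""
    for i in range(n_bases):
        delta = rewards[i] + gamma * V[i, s_next] - V[i, s]  # per-basis RPE
        V[i, s] += alpha * delta

def state_value(s, needs):
    """Weight the basis values by current needs (e.g. [hunger, thirst])."""
    return np.asarray(needs) @ V[:, s]

# Example: a thirsty animal weights the water basis more heavily.
td_update(0, 1, rewards=[1.0, 0.0])
print(state_value(0, needs=[0.2, 0.8]))
```

Because the need weights enter only at read-out, behaviour can shift as soon as hunger or thirst changes, without relearning the per-basis values; this is the flexibility the abstract highlights.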

Cited by 6 publications (6 citation statements)
References 78 publications
“…However, recent experimental work expanding the anatomical locations of recordings and the task designs has shown heterogeneity in dopamine responses that is not readily explained within the canonical TD framework 26,28,32,66,77,78. Nevertheless, a number of these seemingly anomalous findings can be reconciled and integrated within extensions of the RL framework, further reinforcing the power and versatility of TD theory in capturing the intricacies of brain learning mechanisms 24,25,29,45,69,72,74,75,79. In this work, we reveal an additional source of dopaminergic heterogeneity: they encode prediction errors across multiple timescales.…”
Section: Discussion (mentioning, confidence: 91%)
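
The multiple-timescale heterogeneity reported in the statement above can be illustrated with TD errors computed in parallel under several discount factors. A hypothetical toy sketch (assumed names and parameters, not the cited paper's code):

```python
import numpy as np

# Each "channel" discounts the future at its own rate over a shared state
# sequence, so the same reward event produces a different TD error (RPE)
# in each channel.
gammas = np.array([0.5, 0.9, 0.99])    # short- to long-timescale channels
n_states = 10
alpha = 0.1
V = np.zeros((len(gammas), n_states))  # one value function per timescale

def multi_timescale_step(s, s_next, r):
    """Update every channel and return its vector of TD errors."""
    deltas = r + gammas * V[:, s_next] - V[:, s]   # per-channel RPEs
    V[:, s] += alpha * deltas
    return deltas
```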
“…So far, we proposed a descriptive model with a common value function across neurons, suggesting that single neurons' prediction errors are pooled to create a single value function and world model. Recent models for distributed prediction errors across dopamine neurons have instead used parallel loops where individual neurons contribute to estimating separate value functions 25,45,72-75. Instead of a common value function, the dopamine neurons can be part of independent loops and share a common expectation of reward timing.…”
Section: Heterogeneity Of Discount Factors Explains Diverse Ramping A... (mentioning, confidence: 99%)
“…Importantly, showing error signals in this setting goes beyond prior work by us and others showing error signals during learning of specific features of otherwise valuable events (i.e. rewards) 8,12,20,21,51, which can be explained with adjustments to TDRL algorithms, such as the addition of dissociable "threads", "bases", or "channels" for keeping track of different components of rewarding events 24-26. These models cannot easily explain why neutral cues evoke error signals.…”
Section: Discussion (mentioning, confidence: 78%)
“…resulting in RPE-signals that are dissociable according to their defining sensory or task-based properties 24-26. These models are essentially modifications of the current model-free framework to allow more complexity, diversity, or specificity in the RPE signal; however, they remain tied to learning about motivationally significant elements.…”
Section: Introduction (mentioning, confidence: 99%)
“…While this proposal leaves open questions about how such an abstract state representation is implemented biologically (the same being true for ANCCR), it does demonstrate that more complex contingency manipulations can still be explained by TD models. In fact, recent studies have provided evidence for heterogeneous responses to different types of rewards in dopamine neurons 63-65. While further evidence is required to solidify this understanding, the provisional assumption of multiple value channels shows how TD models for multiple outcomes can potentially be achieved in neural circuitry by concurrently running parallel circuits.…”
Section: Limitations Of the Anccr Model As A Model Of Associative Lea... (mentioning, confidence: 99%)