2022
DOI: 10.1101/2022.04.06.487298
Preprint

Learning to Express Reward Prediction Error-like Dopaminergic Activity Requires Plastic Representations of Time

Abstract: Dopamine (DA) releasing neurons in the midbrain learn response patterns that represent reward prediction error (RPE). Typically, models proposing a mechanistic explanation for how dopamine neurons learn to exhibit RPE are based on temporal difference (TD) learning, a machine learning algorithm. However, mechanistic models motivated by TD learning face two significant hurdles. First, TD-based models typically require rather unrealistic components, such as long and robust temporal chains of feature-specific neur…
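
For orientation, the abstract's reference to TD learning can be illustrated with a minimal sketch of the textbook tabular TD(0) error, delta_t = r_t + gamma * V(s_{t+1}) - V(s_t). This is not the model proposed in the preprint. The function names, the three-step trial, and the values of alpha and gamma are illustrative assumptions.

```python
def td_errors(values, rewards, gamma=0.9):
    """Return delta_t = r_t + gamma * V(s_{t+1}) - V(s_t) for each step.

    values has one more entry than rewards: values[t] is V(s_t), and
    values[-1] is the value of the terminal state (kept at 0 here).
    """
    return [r + gamma * values[t + 1] - values[t] for t, r in enumerate(rewards)]


def td0_update(values, rewards, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) sweep over a trial and return updated values."""
    new_values = list(values)
    for t, r in enumerate(rewards):
        delta = r + gamma * new_values[t + 1] - new_values[t]
        new_values[t] += alpha * delta  # move V(s_t) toward its TD target
    return new_values


# Hypothetical trial (an assumption): three time steps, reward 1.0 on the last.
rewards = [0.0, 0.0, 1.0]
naive_values = [0.0, 0.0, 0.0, 0.0]

# Before learning, the only nonzero error is at the time of reward.
deltas_before = td_errors(naive_values, rewards)

# Repeating the trial drives the values toward the discounted future reward,
# so the error at the time of the now-predicted reward shrinks toward zero.
trained_values = naive_values
for _ in range(500):
    trained_values = td0_update(trained_values, rewards)
deltas_after = td_errors(trained_values, rewards)
```

In this sketch each time step has its own table entry, which is the kind of fixed, step-by-step temporal chain the abstract describes as unrealistic. The sketch also omits the pre-cue period, so it does not show the error moving to cue onset.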

Cited by 3 publications (4 citation statements)
References 90 publications (155 reference statements)
“…These mechanisms may work in tandem with Hebbian plasticity to construct cognitive maps and/or they may be more involved in refining behavior policy and other task-specific functions by selectively routing information from the established cognitive maps to other brain regions mediating behavioral policies. Our data indicate that task representations and behavioral policies based upon them are formed in lockstep, as suggested by previous theory [75]. A likely candidate mechanism for the contribution of synaptic plasticity during feedback is behavioral time scale synaptic plasticity (BTSP) [76].…”
Section: Discussion | Citation type: supporting
confidence: 85%
“…It was only for outcome signals recorded in lateral sites that we could detect systematic changes related to changing probabilities of reward. Despite these uncertainties, our observations introduce evidence for dopamine plateau responses as learning-related features to add to transient and ramping responses formerly reported, and raise new questions about RPE encoding by the striatum during learning [39].…”
Section: Discussion | Citation type: supporting
confidence: 58%
“…For example, some cholinergic inputs, some likely from these interneurons, generate action potentials in intrastriatal dopamine fibers far from their cell bodies [16]. Further, oscillatory local field potentials can accompany and even modulate activity [39, 51, 52, 53]. We did not monitor this activity.…”
Section: Discussion | Citation type: mentioning
confidence: 99%