2015
DOI: 10.3758/s13423-014-0790-3

Do learning rates adapt to the distribution of rewards?

Abstract: Studies of reinforcement learning have shown that humans learn differently in response to positive and negative reward prediction errors, a phenomenon that can be captured computationally by positing asymmetric learning rates. This asymmetry, motivated by neurobiological and cognitive considerations, has been invoked to explain learning differences across the lifespan as well as a range of psychiatric disorders. Recent theoretical work, motivated by normative considerations, has hypothesized that the learning …
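The asymmetry described in the abstract can be illustrated with a minimal sketch. This is not the paper's model or code: it assumes a simple Rescorla-Wagner-style value update, and the names `update_value`, `run`, `alpha_pos`, and `alpha_neg` are hypothetical, chosen only for illustration.

```python
def update_value(value, reward, alpha_pos, alpha_neg):
    """One value update with separate learning rates (illustrative assumption).

    The reward prediction error is reward - value. A positive (or zero)
    error is scaled by alpha_pos, a negative error by alpha_neg.
    """
    delta = reward - value  # reward prediction error
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return value + alpha * delta


def run(rewards, alpha_pos, alpha_neg, initial_value=0.5):
    """Apply the update to a sequence of rewards and return the final value."""
    value = initial_value
    for reward in rewards:
        value = update_value(value, reward, alpha_pos, alpha_neg)
    return value


if __name__ == "__main__":
    rewards = [1, 0] * 50  # hypothetical alternating reward sequence
    print("symmetric  :", run(rewards, alpha_pos=0.3, alpha_neg=0.3))
    print("pessimistic:", run(rewards, alpha_pos=0.2, alpha_neg=0.4))
```

Under these assumptions, a learner with a higher rate for negative errors settles on a lower value estimate than a symmetric learner given the same rewards, which is one way the asymmetry can show up in behavior.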


Citation Types

11
157
2

Citations: Cited by 125 publications (170 citation statements)
References: References 23 publications
“…A consistent feature in the reinforcement learning literature is that learning rates for negative prediction errors are higher than those for positive prediction errors, regardless of the distribution of rewards in the task (25, 26, 27). The results of our Standard conditions are consistent with this pattern: In both the gating and probability models, the learning rate parameter that is operative on miss trials (α_miss, α_prob), where the prediction error is always negative, is markedly higher than the learning rates active solely on hit trials (α_hit, α_payoff), where prediction errors are primarily positive (Fig.…”
Section: Discussion (mentioning)
confidence: 92%
“…To investigate whether rewarding outcomes engage DA signaling depending on genotype, we used fMRI. Our prior finding from Fto-deficient mice (Hess et al., 2013) suggested that a lack of Fto specifically impairs D2/3R-mediated autoinhibition of dopaminergic midbrain neurons. Furthermore, ANKK1 genotype modulates midbrain response to rewards in humans (Felsted et al., 2010), and reward prediction errors (PEs) are encoded by phasic dopamine release from neurons in the ventral tegmental area/substantia nigra (VTA/SN) (Schultz et al., 1997; Montague et al., 2004).…”
Section: Introduction (mentioning)
confidence: 90%
“…Moreover, our recent analysis of Fto-deficient mice revealed that a lack of Fto specifically impairs dopamine receptor D2/3-mediated control of neuronal activation. Here Fto deficiency led to increased 6-methyl adenosine modification of specific mRNAs of critical components of D2/3R-signaling, including that of D3R and the GIRK2-channel, thus reducing their translation and affecting dopamine-dependent regulation of locomotor activity and reward sensitivity (Hess et al., 2013). Consistently, behavioral alterations associated with FTO variants in humans have also been linked to altered dopaminergic transmission (Kenny, 2011b).…”
Section: Introduction (mentioning)
confidence: 96%