2014
DOI: 10.1037/a0037015

Opponent actor learning (OpAL): Modeling interactive effects of striatal dopamine on reinforcement learning and choice incentive.

Abstract: The striatal dopaminergic system has been implicated in reinforcement learning (RL), motor performance, and incentive motivation. Various computational models have been proposed to account for each of these effects individually, but a formal analysis of their interactions is lacking. Here we present a novel algorithmic model expanding the classical actor-critic architecture to include fundamental interactive properties of neural circuit models, incorporating both incentive and learning effects into a single th…

Cited by 390 publications (516 citation statements)
References 71 publications (204 reference statements)

“…One interpretation of this result is that increased dopamine produces increased salience for positive feedback. This interpretation is consistent with evidence that the FRN is influenced by the salience of outcomes (eg, Pfabigan et al, 2015), that increased striatal dopamine is associated with heightened sensitivity to rewards and decreased sensitivity to punishments (Collins and Frank, 2014), and that evidence of increased sensitivity to rewards is associated with increased FRN for unexpected rewards (Smillie et al, 2011). The reversal learning task (RLT) developed by Cools et al (2009) involves periodic, unexpected positive and negative feedback, with increased striatal dopamine (either baseline levels or due to pharmacological manipulation) associated with behavioral evidence of better learning after unexpected reward relative to unexpected punishment (Cools et al, 2009; van der Schaaf et al, 2014).…”
supporting
confidence: 72%
“…Indeed, an extension of the model with only the direct pathway (Lo and Wang, 2006) that incorporates the dopamine system and reward-dependent plasticity at corticostriatal projections was recently presented (Hsiao and Lo, 2013). Related existing work includes a cortex-BG model implementing optimal decision making for multiple choices (Bogacz and Gurney, 2007) and reinforcement learning (Bogacz and Larsen, 2011) and an extended actor-critic model providing a unified description of reinforcement learning and choice incentive, where the direct and indirect pathway striatal neurons play the roles of two opponent actors and exhibit reward-dependent learning (Collins and Frank, 2014).…”
Section: Discussion
mentioning
confidence: 99%
“…Figure 5 shows the effects of individual differences in the genetic polymorphism for DARPP-32, which primarily affects striatal dopaminergic functioning associated with D1 and D2 pathways related to learning from positive and negative action values [26-29, 45]. Absent any differences in overall task performance (easy contrast not significant), the presence of a DARPP-32 C allele was associated with an increased C>B bias in the conflict cost contrast (T/T N = 35, T/C N = 26, C/C N = 22; t(82) = 1.99, P = 0.05), consistent with prior studies linking this allele to a bias towards reward-based learning and choice.…”
Section: Study I Participants and Task
mentioning
confidence: 99%
“…Dopamine bursts in the cortico-striatal D1 direct pathway underlies ability to learn from and seek reward, whereas dopamine dips in the D2-mediated indirect pathway underlies the ability to learn from and avoid punishment [22-25]. Computational models show how the striatal D1 and D2 pathways come to represent values and costs in such tasks, and that choices in reward-based tasks are best described by an opponent process whereby each choice option has a corresponding positive (D1) and negative (D2) action value [26]. The dopamine- and cyclic AMP-regulated phosphoprotein (DARPP-32) has been used as a marker for cortico-striatal plasticity, where an increasing number of T alleles predict an imbalance in learning favouring D1 relative to D2 pathways.…”
mentioning
confidence: 99%