The role of neurons in the substantia nigra (SN) and ventral tegmental area (VTA) of the midbrain in generating reward prediction errors during appetitive learning is well established. Less is known about the differential contribution of these midbrain regions to appetitive versus aversive learning, especially in humans. Here we scanned human participants with high-resolution fMRI focused on the SN and VTA while they underwent a sequential Pavlovian conditioning paradigm involving an appetitive outcome (a pleasant juice) as well as an aversive outcome (an unpleasant bitter and salty flavor). We found a degree of regional specialization within the SN: whereas a ventromedial region of the SN correlated with a temporal difference reward prediction error during appetitive Pavlovian learning, a dorsolateral area correlated instead with an aversive expected value signal in response to the most distal cue and with a reward prediction error in response to the cue most proximal to the aversive outcome. Furthermore, participants' affective reactions to the appetitive and aversive conditioned stimuli, measured more than 1 year after the fMRI experiment, correlated with activation obtained during the experiment in the ventromedial and dorsolateral SN, respectively. These findings suggest that, whereas the human ventromedial SN contributes to long-term learning about rewards, the dorsolateral SN may be particularly important for long-term learning in aversive contexts.
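For readers unfamiliar with the temporal difference prediction error referenced in this abstract, the following is a minimal sketch of a generic TD(0) update in a two-cue sequential Pavlovian trial (distal cue, then proximal cue, then outcome). The learning rate, discount factor, and outcome value are illustrative assumptions, not the parameters fitted in the study.

```python
import numpy as np

# Generic TD(0) prediction-error sketch for a sequential Pavlovian trial:
# distal cue -> proximal cue -> outcome. All parameters are assumed for illustration.
alpha, gamma = 0.1, 1.0                    # learning rate, discount factor (assumed)
V = {"distal": 0.0, "proximal": 0.0}       # learned cue values
outcome = 1.0                              # e.g., +1 for the pleasant juice; a negative value could model the aversive flavor

for trial in range(200):
    # prediction error at the distal cue (no outcome delivered yet)
    delta_distal = 0.0 + gamma * V["proximal"] - V["distal"]
    V["distal"] += alpha * delta_distal

    # prediction error at the proximal cue, when the outcome is delivered
    delta_proximal = outcome + gamma * 0.0 - V["proximal"]
    V["proximal"] += alpha * delta_proximal

print(V)  # both cue values converge toward the outcome value
```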
This article seeks to integrate two sets of theories describing action selection in the basal ganglia: reinforcement learning theories describing how the actions that maximize reward are learned, and decision-making theories proposing that the basal ganglia select actions on the basis of sensory evidence accumulated in the cortex. In particular, we present a model that integrates the actor-critic model of reinforcement learning with a model in which the cortico-basal-ganglia circuit implements a statistically optimal decision-making procedure. The cortico-striatal weights required for optimal decision making in our model differ from those provided by standard reinforcement learning models. Nevertheless, we show that an actor-critic model converges to the weights required for optimal decision making when biologically realistic limits on synaptic weights are introduced. We also describe the model's predictions concerning reaction times and neural responses during learning, and we discuss the directions required for further integration of reinforcement learning and optimal decision-making theories.
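To make the bounded-weight idea concrete, here is a minimal actor-critic sketch in which the actor's cortico-striatal weights are clipped to a fixed range. The two-armed task, learning rates, and weight limits are assumptions for illustration only; they are not the model described in the article.

```python
import numpy as np

# Minimal actor-critic with bounded ("clipped") weights; all parameters assumed.
rng = np.random.default_rng(0)
alpha_critic, alpha_actor = 0.05, 0.05
w_min, w_max = 0.0, 1.0                    # biologically motivated weight bounds (assumed)

n_actions = 2
V = 0.0                                    # critic's value of the single stimulus state
w = np.full(n_actions, 0.5)                # actor's cortico-striatal weights
p_reward = np.array([0.8, 0.2])            # reward probability of each action (assumed)

for trial in range(2000):
    # softmax action selection by the actor
    p = np.exp(w) / np.exp(w).sum()
    a = rng.choice(n_actions, p=p)
    r = float(rng.random() < p_reward[a])

    # dopamine-like prediction error computed by the critic
    delta = r - V
    V += alpha_critic * delta

    # actor update, with weights kept inside the assumed biological limits
    w[a] += alpha_actor * delta
    w = np.clip(w, w_min, w_max)

print(w)  # weights saturate at the bounds rather than growing without limit
```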
Prediction-error signals consistent with formal models of “reinforcement learning” (RL) have repeatedly been found within dopaminergic nuclei of the midbrain and dopaminoceptive areas of the striatum. However, the precise form of the RL algorithms implemented in the human brain is not yet well determined. Here, we created a novel paradigm optimized to dissociate the subtypes of reward-prediction errors that function as the key computational signatures of two distinct classes of RL models—namely, “actor/critic” models and action-value-learning models (e.g., the Q-learning model). The state-value-prediction error (SVPE), which is independent of actions, is a hallmark of the actor/critic architecture, whereas the action-value-prediction error (AVPE) is the distinguishing feature of action-value-learning algorithms. To test for the presence of these prediction-error signals in the brain, we scanned human participants with a high-resolution functional magnetic-resonance imaging (fMRI) protocol optimized to enable measurement of neural activity in the dopaminergic midbrain as well as the striatal areas to which it projects. In keeping with the actor/critic model, the SVPE signal was detected in the substantia nigra. The SVPE was also clearly present in both the ventral striatum and the dorsal striatum. However, alongside these purely state-value-based computations we also found evidence for AVPE signals throughout the striatum. These high-resolution fMRI findings suggest that model-free aspects of reward learning in humans can be explained algorithmically with RL in terms of an actor/critic mechanism operating in parallel with a system for more direct action-value learning.
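The dissociation between the two prediction-error subtypes can be illustrated with a short sketch that computes both signals on the same trials: the action-independent state-value-prediction error (SVPE) of an actor/critic, and the action-value-prediction error (AVPE) of a Q-learning-style algorithm. The bandit task, random policy, and parameters are illustrative assumptions, not the paradigm used in the study.

```python
import numpy as np

# Contrast SVPE (action-independent) with AVPE (tied to the chosen action);
# task and parameters are assumed for illustration.
rng = np.random.default_rng(1)
alpha, n_actions = 0.1, 2
V = 0.0                                    # state value (critic)
Q = np.zeros(n_actions)                    # action values (Q-learner)
p_reward = np.array([0.7, 0.3])            # reward probability of each action (assumed)

for trial in range(1000):
    a = rng.integers(n_actions)            # random policy, for illustration
    r = float(rng.random() < p_reward[a])

    svpe = r - V                           # same signal whichever action was taken
    avpe = r - Q[a]                        # depends on the value of the chosen action

    V += alpha * svpe
    Q[a] += alpha * avpe

print(V, Q)  # V tracks the policy's average reward; Q tracks each action's own reward rate
```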