The transient response of dopamine neurons has been described as reward prediction error (RPE), with activation or suppression by events that are better or worse than expected, respectively. However, at least a minority of neurons are activated by aversive or high-intensity stimuli, casting doubt on the generality of RPE in describing the dopamine signal. To overcome limitations of previous studies, we studied neuronal responses to a wider variety of high-intensity and aversive stimuli, and we quantified and controlled aversiveness through a choice task in which macaques sacrificed juice to avoid aversive stimuli. Whereas most previous work has portrayed the RPE as a single impulse or “phase,” here we demonstrate its multiphasic temporal dynamics. Aversive or high-intensity stimuli evoked a triphasic sequence of activation-suppression-activation extending over a period of 40–700 ms. The initial activation at short latencies (40–120 ms) reflected sensory intensity. The influence of motivational value became dominant between 150 and 250 ms, with activation in the case of appetitive stimuli, and suppression in the case of aversive and neutral stimuli. The previously unreported late activation appeared to be a modest “rebound” after strong suppression. Similarly, strong activation by reward was often followed by suppression. We suggest that these “rebounds” may result from overcompensation by homeostatic mechanisms in some cells. Our results are consistent with a realistic RPE, which evolves over time through a dynamic balance of excitation and inhibition.
Dopamine neurons of the ventral midbrain have been found to signal a reward prediction error that can mediate positive reinforcement. Despite the demonstration of modest diversity at the cellular and molecular levels, there has been little analysis of response diversity in behaving animals. Here we examine response diversity in rhesus macaques to appetitive, aversive, and neutral stimuli having relative motivational values that were measured and controlled through a choice task. First, consistent with previous studies, we observed a continuum of response variability and an apparent absence of distinct clusters in scatter plots, suggesting a lack of statistically discrete subpopulations of neurons. Second, we found that a group of “sensitive” neurons tend to be more strongly suppressed by a variety of stimuli and to be more strongly activated by juice. Third, neurons in the “ventral tier” of substantia nigra were found to have greater suppression, and a subset of these had higher baseline firing rates and late “rebound” activation after suppression. These neurons could belong to a previously identified subgroup of dopamine neurons that express high levels of H-type cation channels but lack calbindin. Fourth, neurons further rostral exhibited greater suppression. Fifth, although we observed weak activation of some neurons by aversive stimuli, this was not associated with their aversiveness. In conclusion, we find a diversity of response properties, distributed along a continuum, within what may be a single functional population of neurons signaling reward prediction error.
Because most rewarding events are probabilistic and changing, the extinction of probabilistic rewards is important for survival. It has been proposed that the extinction of probabilistic rewards depends on arousal and the amount of learning of reward values. Midbrain dopamine neurons have been suggested to play a role in both arousal and learning reward values. Despite extensive research on modeling dopaminergic activity in reward learning (e.g., temporal difference models), few studies have modeled its role in arousal. Although temporal difference models capture key characteristics of dopaminergic activity during the extinction of deterministic rewards, they have been less successful at simulating the extinction of probabilistic rewards. By adding an arousal signal to a temporal difference model, we were able to simulate the extinction of probabilistic rewards and its dependence on the amount of learning. Our simulations suggest that arousal allows the probability of reward to have lasting effects on the updating of reward value, which slows the extinction of low-probability rewards. Using this model, we predicted that, by signaling the prediction error, dopamine determines the learned reward value that has to be extinguished during extinction and participates in regulating the size of the arousal signal that controls the learning rate. These predictions were supported by pharmacological experiments in rats.
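To make the idea of an arousal signal controlling the learning rate concrete, the following is a minimal illustrative sketch. It is not the authors' model, and it is not claimed to reproduce their results. It assumes a trial-level simplification of a temporal difference update (a single cue followed by a terminal outcome, so the prediction error reduces to reward minus value), and it assumes that arousal is a running average of the unsigned prediction error. The function name `simulate_extinction` and the parameters `base_alpha`, `arousal_gain`, and `arousal_decay` are hypothetical and do not come from the abstract.

```python
import random


def simulate_extinction(p_reward, n_acq, n_ext, base_alpha=0.1,
                        arousal_gain=1.0, arousal_decay=0.9, seed=0):
    """Return the learned value after each trial of acquisition then extinction.

    Assumptions (illustrative only, not taken from the source):
    - one cue per trial, so the TD error reduces to reward - value
    - arousal tracks a running average of the unsigned prediction error
    - arousal scales the learning rate multiplicatively
    """
    rng = random.Random(seed)
    value = 0.0
    arousal = 0.0
    values = []
    for trial in range(n_acq + n_ext):
        # Reward is delivered with probability p_reward during acquisition,
        # and never during extinction.
        rewarded = trial < n_acq and rng.random() < p_reward
        reward = 1.0 if rewarded else 0.0
        delta = reward - value  # prediction error for this trial
        # Assumed arousal dynamics: leaky average of |prediction error|.
        arousal = arousal_decay * arousal + (1.0 - arousal_decay) * abs(delta)
        # Assumed modulation: higher arousal gives a larger learning rate.
        alpha = base_alpha * (1.0 + arousal_gain * arousal)
        value += alpha * delta
        values.append(value)
    return values


if __name__ == "__main__":
    for p in (1.0, 0.25):
        vals = simulate_extinction(p, n_acq=100, n_ext=50)
        print(f"p={p}: value at end of acquisition={vals[99]:.3f}, "
              f"after extinction={vals[-1]:.3f}")
```

Setting `arousal_gain=0.0` recovers a fixed learning rate, which gives a baseline for comparing how an arousal term changes the time course of extinction under different reward probabilities.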
An imbalance in goal-directed and habitual behavioral control is a hallmark of decision-making–related disorders, including addiction. Although the external globus pallidus (GPe), which harbors enriched astrocytes, is critical for action selection, the role of GPe astrocytes in action-selection strategies remained unknown. Using in vivo calcium signaling with fiber photometry, we found substantially attenuated GPe astrocytic activity during habitual learning compared to goal-directed learning. Support vector machine analysis predicted the behavioral outcomes. Chemogenetic activation of the astrocytes or inhibition of GPe pan-neuronal activity facilitated the transition from habitual to goal-directed reward-seeking behavior. Next, we found increased astrocyte-specific GABA (γ-aminobutyric acid) transporter type 3 (GAT3) messenger RNA expression during habit learning. Notably, the pharmacological inhibition of GAT3 occluded astrocyte activation–induced transition from habitual to goal-directed behavior. In addition, attentional stimuli shifted behavior from habitual to goal-directed. Our findings suggest that GPe astrocytes regulate the action-selection strategy and behavioral flexibility.