The ability to rapidly and flexibly adapt decisions to available rewards is crucial for survival in dynamic environments. Rewardbased decisions are guided by reward expectations that are updated based on prediction errors, and processing of these errors involves dopaminergic neuromodulation in the striatum. To test the hypothesis that the COMT gene Val 158 Met polymorphism leads to interindividual differences in reward-based learning, we used the neuromodulatory role of dopamine in signaling prediction errors. We show a behavioral advantage for the phylogenetically ancestral Val/Val genotype in an instrumental reversal learning task that requires rapid and flexible adaptation of decisions to changing reward contingencies in a dynamic environment. Implementing a reinforcement learning model with a dynamic learning rate to estimate prediction error and learning rate for each trial, we discovered that a higher and more flexible learning rate underlies the advantage of the Val/Val genotype. Model-based fMRI analysis revealed that greater and more differentiated striatal fMRI responses to prediction errors reflect this advantage on the neurobiological level. Learning rate-dependent changes in effective connectivity between the striatum and prefrontal cortex were greater in the Val/Val than Met/Met genotype, suggesting that the advantage results from a downstream effect of the prefrontal cortex that is presumably mediated by differences in dopamine metabolism. These results show a critical role of dopamine in processing the weight a particular prediction error has on the expectation updating for the next decision, thereby providing important insights into neurobiological mechanisms underlying the ability to rapidly and flexibly adapt decisions to changing reward contingencies.COMT ͉ dopamine ͉ functional MRI ͉ learning rate ͉ reinforcement learning H uman learning spans a wide range of functions from ancient evolutionary accomplishments [e.g., fast and intuitive learning from rewards (1, 2)], to complex executive processes that require high capacities of attention and working memory (3, 4). These functions are supported by interacting brain regions that comprise phylogenetically ancient structures (e.g., the striatum) as well as more recently evolved neocortical structures [e.g., the prefrontal cortex (PFC)]. Still, different learning processes rely on the same neuromodulators. Dopamine (DA) modulates synaptic efficacy in both the striatum and PFC (5) and is thus involved in different kinds of learning. Accordingly, interindividual differences in dopaminergic neuromodulatory mechanisms contribute to performance differences in both simple instrumental learning tasks and complex cognitive tasks (6-10).One interindividual difference in dopaminergic neuromodulation that has been linked to PFC function is the uniquely human Val 158 Met polymorphism in the enzyme catechol-O-methyltransferase (COMT), which regulates extrasynaptical DA degradation (8-10). COMT activity is lower in individuals homozygous for the mutated allele...
Learning by following explicit advice is fundamental for human cultural evolution, yet the neurobiology of adaptive social learning is largely unknown. Here, we used simulations to analyze the adaptive value of social learning mechanisms, computational modeling of behavioral data to describe cognitive mechanisms involved in social learning, and model-based functional magnetic resonance imaging (fMRI) to identify the neurobiological basis of following advice. One-time advice received before learning had a sustained influence on people's learning processes. This was best explained by social learning mechanisms implementing a more positive evaluation of the outcomes from recommended options. Computer simulations showed that this “outcome-bonus” accumulates more rewards than an alternative mechanism implementing higher initial reward expectation for recommended options. fMRI results revealed a neural outcome-bonus signal in the septal area and the left caudate. This neural signal coded rewards in the absence of advice, and crucially, it signaled greater positive rewards for positive and negative feedback after recommended rather than after non-recommended choices. Hence, our results indicate that following advice is intrinsically rewarding. A positive correlation between the model's outcome-bonus parameter and amygdala activity after positive feedback directly relates the computational model to brain activity. These results advance the understanding of social learning by providing a neurobiological account for adaptive learning from advice.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.