The influential notion that the hippocampus supports associative memory by interacting with functionally distinct and distributed brain regions has not been directly tested in humans. We therefore used targeted noninvasive electromagnetic stimulation to modulate human cortical-hippocampal networks and tested effects of this manipulation on memory. Multi-session stimulation increased functional connectivity among distributed cortical-hippocampal network regions and concomitantly improved associative memory performance. These alterations involved localized long-term plasticity, because increases were highly selective to the targeted brain regions, and enhancements of connectivity and associative memory persisted for ~24 hours following stimulation. Targeted cortical-hippocampal networks can thus be enhanced noninvasively, demonstrating their role in associative memory.
Deep reinforcement learning (RL) methods have driven impressive advances in artificial intelligence in recent years, exceeding human performance in domains ranging from Atari to Go to no-limit poker. This progress has drawn the attention of cognitive scientists interested in understanding human learning. However, the concern has been raised that deep RL may be too sample-inefficient, that is, simply too slow, to provide a plausible model of how humans learn. In the present review, we counter this critique by describing recently developed techniques that allow deep RL to operate more nimbly, solving problems much more quickly than previous methods. Although these techniques were developed in an AI context, we propose that they may have rich implications for psychology and neuroscience. A key insight, arising from these AI methods, concerns the fundamental connection between fast RL and slower, more incremental forms of learning.

Powerful but Slow: The First Wave of Deep RL

Over just the past few years, revolutionary advances have occurred in artificial intelligence (AI) research, where a resurgence in neural network or 'deep learning' methods [1,2] has fueled breakthroughs in image understanding [3,4], natural language processing [5,6], and many other areas. These developments have attracted growing interest from psychologists, psycholinguists, and neuroscientists, curious about whether developments in AI might suggest new hypotheses concerning human cognition and brain function [7-11]. One area of AI research that appears particularly inviting from this perspective is deep RL (Box 1). Deep RL marries neural network modeling (see Glossary) with reinforcement learning, a set of methods for learning from rewards and punishments rather than from more explicit instruction [12]. After decades as an aspirational rather than practical idea, deep RL has within the past 5 years exploded into one of the most intense areas of AI research, generating superhuman performance in tasks from video games [13] to poker [14], multiplayer contests [15], and complex board games, including Go and chess [16-19].

Highlights: Recent AI research has given rise to powerful techniques for deep reinforcement learning. In their combination of representation learning with reward-driven behavior, deep RL methods would appear to have inherent interest for psychology and neuroscience.
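To make the basic recipe concrete, the following is a minimal sketch of the kind of slow, incremental deep RL update discussed above: a small value network adjusted by a temporal-difference error on a toy environment. The environment, network sizes, and hyperparameters are illustrative assumptions of this sketch, not details taken from the review.

```python
# Minimal sketch of the core deep RL loop: a small neural network maps states
# to action values and is updated incrementally from reward feedback via a
# temporal-difference (Q-learning) error. Toy, illustrative setup only.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, n_hidden = 4, 2, 16

# Two-layer value network: state (one-hot) -> hidden -> Q-value per action.
W1 = rng.normal(0, 0.1, (n_hidden, n_states))
W2 = rng.normal(0, 0.1, (n_actions, n_hidden))

def q_values(state):
    x = np.eye(n_states)[state]
    h = np.tanh(W1 @ x)
    return W2 @ h, h, x

alpha, gamma, epsilon = 0.05, 0.9, 0.1
state = 0
for step in range(5000):
    q, h, x = q_values(state)
    # Epsilon-greedy action selection from the network's value estimates.
    action = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(q))
    # Toy environment: action 1 taken in state 3 pays off; otherwise no reward.
    next_state = (state + action) % n_states
    reward = 1.0 if (state == 3 and action == 1) else 0.0
    q_next, _, _ = q_values(next_state)
    # Temporal-difference error for the chosen action.
    td_error = reward + gamma * np.max(q_next) - q[action]
    # Gradient step on the squared TD error w.r.t. both weight matrices.
    grad_q = np.zeros(n_actions); grad_q[action] = -td_error
    W2 -= alpha * np.outer(grad_q, h)
    W1 -= alpha * np.outer((W2.T @ grad_q) * (1 - h**2), x)
    state = next_state
```

Because every weight update makes only a small correction toward the TD target, learning of this kind needs many experiences, which is exactly the sample-inefficiency the review sets out to address.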
Over the past twenty years, neuroscience research on reward-based learning has converged on a canonical model, under which the neurotransmitter dopamine 'stamps in' associations between situations, actions and rewards by modulating the strength of synaptic connections between neurons. However, a growing number of recent findings have placed this standard model under strain. In the present work, we draw on recent advances in artificial intelligence to introduce a new theory of reward-based learning. Here, the dopamine system trains another part of the brain, the prefrontal cortex, to operate as its own free-standing learning system. This new perspective accommodates the findings that motivated the standard model, but also deals gracefully with a wider range of observations, providing a fresh foundation for future research.

Exhilarating advances have recently been made toward understanding the mechanisms involved in reward-driven learning. This progress has been enabled in part by the importation of ideas from the field of reinforcement learning (RL) [1]. Most centrally, this input has led to an RL-based theory of dopaminergic function. Here, phasic dopamine (DA) release is interpreted as conveying a reward prediction error (RPE) signal [2-4], an index of surprise which figures centrally in temporal-difference RL algorithms [1]. Under the theory, the RPE drives synaptic plasticity in the striatum, translating experienced action-reward associations into optimized behavioral policies [4,5]. Over the past two decades, evidence has steadily mounted for this proposal, establishing it as the standard model of reward-driven learning.

However, even as this standard model has solidified, a collection of problematic observations has accumulated. One quandary arises from research on prefrontal cortex (PFC). A growing body of evidence suggests that PFC implements mechanisms for reward-based learning, performing computations that strikingly resemble those ascribed to DA-based RL. It has long been established that sectors of the PFC represent the expected values of actions, objects and states [6-8]. More recently, it has emerged that PFC also encodes the recent history of actions and rewards [9-15]. The set of variables encoded, along with observations concerning the temporal profile of neural activation in the PFC, has led to the conclusion that "PFC neurons dynamically [encode] conversions from reward and choice history to object value, and from object value to object choice" [10]. In short, neural activity in PFC appears to reflect a set of operations that together constitute a self-contained RL algorithm.

Placing PFC beside DA, we obtain a picture containing two full-fledged RL systems, one utilizing activity-based representations and the other synaptic learning. What is the relationship between these systems? If both support RL, are their functions simply redundant? One suggestion has been that DA and PFC subserve different forms of learning, with DA implementing model-free RL, based on direct stim...
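For readers unfamiliar with the formalism, the sketch below illustrates the reward prediction error at the heart of temporal-difference RL: the RPE is the mismatch between the outcome actually received (plus discounted future value) and the current value estimate, and it drives the update of that estimate. The three-state chain environment and parameters are hypothetical, chosen only to show the mechanism.

```python
# Minimal temporal-difference value learning driven by a reward prediction
# error (RPE). The RPE plays the role the standard model assigns to phasic
# dopamine: a surprise signal that adjusts stored value estimates.
import numpy as np

n_states = 3          # simple chain: s0 -> s1 -> s2 (terminal, reward 1)
gamma = 0.9           # discount factor
alpha = 0.1           # learning rate
values = np.zeros(n_states)

for episode in range(200):
    for state in range(n_states):
        reward = 1.0 if state == n_states - 1 else 0.0
        next_value = 0.0 if state == n_states - 1 else values[state + 1]
        # Reward prediction error: outcome relative to current expectation.
        rpe = reward + gamma * next_value - values[state]
        # RPE-driven update, analogous to the proposed plasticity signal.
        values[state] += alpha * rpe

print(values)  # converges toward [gamma**2, gamma, 1.0]
```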
In recent years, deep reinforcement learning (RL) systems have attained superhuman performance in a number of challenging task domains. However, a major limitation of such applications is their demand for massive amounts of training data. A critical present objective is thus to develop deep RL methods that can adapt rapidly to new tasks. In the present work, we introduce a novel approach to this challenge, which we refer to as deep meta-reinforcement learning. Previous work has shown that recurrent networks can support meta-learning in a fully supervised context. We extend this approach to the RL setting. What emerges is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure. This second, learned RL algorithm can differ from the original one in arbitrary ways. Importantly, because it is learned, it is configured to exploit structure in the training domain. We unpack these points in a series of seven proof-of-concept experiments, each of which examines a key aspect of deep meta-RL. We consider prospects for extending and scaling up the approach, and also point out some potentially important implications for neuroscience.
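The sketch below illustrates, under assumed architectural details not given in the abstract, the structure this setup describes: a recurrent policy that receives its own previous action and reward as input, so that after outer-loop training its recurrent dynamics can adapt to a new task (here, a two-armed bandit) without any weight changes. The class name, layer sizes, and bandit task are illustrative assumptions.

```python
# Sketch of a deep meta-RL agent: an LSTM core receives the previous action
# and reward, and within-episode adaptation is carried by its hidden state.
import torch
import torch.nn as nn

class MetaRLAgent(nn.Module):
    def __init__(self, n_actions=2, hidden_size=48):
        super().__init__()
        # Input: one-hot previous action + previous reward (scalar).
        self.core = nn.LSTMCell(n_actions + 1, hidden_size)
        self.policy = nn.Linear(hidden_size, n_actions)  # action logits
        self.value = nn.Linear(hidden_size, 1)            # value baseline

    def forward(self, prev_action, prev_reward, state):
        x = torch.cat([prev_action, prev_reward], dim=-1)
        h, c = self.core(x, state)
        return self.policy(h), self.value(h), (h, c)

agent = MetaRLAgent()
arm_probs = torch.tensor([0.2, 0.8])        # one sampled bandit instance
h = c = torch.zeros(1, 48)
prev_action = torch.zeros(1, 2)
prev_reward = torch.zeros(1, 1)

for t in range(100):
    logits, value, (h, c) = agent(prev_action, prev_reward, (h, c))
    action = torch.distributions.Categorical(logits=logits).sample()
    reward = torch.bernoulli(arm_probs[action]).view(1, 1)
    # Feed the outcome back in: the "inner" learning happens in (h, c),
    # with no change to the network weights during the episode.
    prev_action = torch.nn.functional.one_hot(action, 2).float()
    prev_reward = reward
    # An outer-loop RL algorithm (e.g., advantage actor-critic) would use the
    # logits, values, and rewards collected here to update the weights across
    # many sampled bandits; that outer loop is omitted for brevity.
```

Feeding the previous action and reward back into the recurrent core is what lets the trained network behave as its own learning procedure: across outer-loop training it discovers, in its dynamics, a strategy for exploiting the structure shared by the sampled tasks.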