2018
DOI: 10.1101/295964
Preprint
Prefrontal Cortex as a Meta-Reinforcement Learning System

Abstract: Over the past twenty years, neuroscience research on reward-based learning has converged on a canonical model, under which the neurotransmitter dopamine 'stamps in' associations between situations, actions and rewards by modulating the strength of synaptic connections between neurons. However, a growing number of recent findings have placed this standard model under strain. In the present work, we draw on recent advances in artificial intelligence to introduce a new theory of reward-based learning. Here, the d…

Cited by 179 publications (311 citation statements)
References 69 publications
“…Beyond the algorithms examined in our paper, there are a number of other important ideas that have been successful in machine learning. For example, hierarchical RL algorithms learn policy primitives that combine to produce solutions to different tasks, [11] while meta-RL algorithms learn a learning algorithm that can adapt quickly to new tasks, [12,13] echoing the classic formation of task sets described by Harlow. [14] Of particular note is the development of hybrids of UVFAs and SF&GPI known as universal successor features approximators, which combine the benefits of both approaches.…”
Section: Discussion (mentioning, confidence: 99%)
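As background on the UVFA and SF&GPI ideas named in the statement above, the standard definitions can be summarized as follows; the symbols used here (feature map φ, successor features ψ, task vector w, goal g, policy embedding z) are conventional notation and are not drawn from the text above.

Q^{\pi}_{\mathbf w}(s,a) = \psi^{\pi}(s,a)^{\top}\mathbf w,
\qquad
\psi^{\pi}(s,a) = \mathbb{E}^{\pi}\!\Big[\sum_{t \ge 0} \gamma^{t}\,\phi(s_t,a_t) \;\Big|\; s_0 = s,\ a_0 = a\Big]

\text{GPI over a policy library } \{\pi_1,\dots,\pi_n\}: \quad \pi^{\mathrm{GPI}}(s) \in \arg\max_{a}\,\max_{i}\,\psi^{\pi_i}(s,a)^{\top}\mathbf w

A UVFA instead learns a single goal-conditioned value function Q(s, a, g); universal successor features approximators learn ψ(s, a, z) over a policy embedding z, so that one approximator generalizes across tasks while still supporting GPI.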
“…Yang and colleagues [15] trained a single recurrent neural network to solve multiple related tasks and observed the emergence of functionally specialized clusters of neurons, mixed selectivity neurons like those found in macaque prefrontal cortex, as well as compositional task representations reminiscent of hierarchical RL. Wang and colleagues [12] proposed how meta-RL might be implemented in the brain, with dopamine signals gradually training a separate learning algorithm in the prefrontal cortex, which in turn can rapidly adapt to changing task demands.…”
Section: Discussion (mentioning, confidence: 99%)
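The two-timescale picture attributed to Wang and colleagues above can be made concrete with a minimal sketch: a slow outer loop adjusts the weights of a recurrent network across many episodes (playing the role of the dopamine-driven learning signal), while within an episode adaptation happens only in the network's recurrent activity. The two-armed bandit task, the network size, and the use of plain REINFORCE rather than the actor-critic setup of the original paper are illustrative assumptions.

```python
# Minimal recurrent meta-RL sketch (illustrative; not Wang et al.'s exact model).
# Slow loop: gradient steps on the LSTM weights across episodes.
# Fast loop: within an episode the weights are frozen; the hidden state adapts
# to the current bandit from the stream of (previous action, previous reward).
import torch
import torch.nn as nn

class MetaRLAgent(nn.Module):
    def __init__(self, n_actions=2, hidden=48):
        super().__init__()
        self.core = nn.LSTMCell(n_actions + 1, hidden)   # input: one-hot prev action + prev reward
        self.policy = nn.Linear(hidden, n_actions)

    def forward(self, x, state):
        h, c = self.core(x, state)
        return torch.distributions.Categorical(logits=self.policy(h)), (h, c)

def run_episode(agent, n_actions=2, steps=100):
    """One episode: a fresh bandit is sampled; adaptation uses only the recurrent state."""
    arm_probs = torch.rand(n_actions)                    # this episode's reward probabilities
    hidden = agent.core.hidden_size
    state = (torch.zeros(1, hidden), torch.zeros(1, hidden))
    x = torch.zeros(1, n_actions + 1)                    # no previous action/reward yet
    log_probs, rewards = [], []
    for _ in range(steps):
        dist, state = agent(x, state)
        action = dist.sample()
        reward = torch.bernoulli(arm_probs[action])
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)
        x = torch.cat([nn.functional.one_hot(action, n_actions).float(),
                       reward.view(1, 1)], dim=1)
    return torch.cat(log_probs), torch.stack(rewards).squeeze(1)

agent = MetaRLAgent()
optimizer = torch.optim.Adam(agent.parameters(), lr=1e-3)
for episode in range(2000):                              # the slow, dopamine-like loop
    log_probs, rewards = run_episode(agent)
    returns = torch.flip(torch.cumsum(torch.flip(rewards, [0]), 0), [0])  # return-to-go
    loss = -(log_probs * (returns - returns.mean())).sum()                # REINFORCE with baseline
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After training, the fixed-weight network improves its choices within each new bandit episode purely through its hidden-state dynamics, which is the sense in which the slow loop has learned a fast learning algorithm.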
“…Another recent study used a meta-learning approach to model DA activity and activity in the prefrontal cortex (PFC) of mammals (Wang et al., 2018). Unlike our study, in which the "slow" optimization is taken to represent evolutionary and developmental processes that determine the MB output circuitry, in this study the slow component of learning involved DA-dependent optimization of recurrent connections in PFC.…”
Section: Relationship To Other Modeling Approaches (mentioning, confidence: 99%)
“…Both MBRL and MFRL suffer from this limitation. For example, Q-learning (Watkins & Dayan, 1992) (Wang et al., 2018). MBRL fares little better.…”
Section: Reinforcement Learning (mentioning, confidence: 99%)
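For reference, the Q-learning algorithm cited above (Watkins & Dayan, 1992) is the one-step tabular update Q(s,a) ← Q(s,a) + α[r + γ max_a' Q(s',a') − Q(s,a)], sketched below; the env.reset/env.step interface and the hyperparameter values are assumptions made for illustration, not details from the citing paper.

```python
# Tabular Q-learning (Watkins & Dayan, 1992); the environment interface is assumed.
import random
from collections import defaultdict

def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    Q = defaultdict(float)                      # Q[(state, action)], defaults to 0.0
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy behaviour policy; the update itself is off-policy.
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```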