2022
DOI: 10.1613/jair.1.13673

Towards Continual Reinforcement Learning: A Review and Perspectives

Abstract: In this article, we aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We begin by discussing our perspective on why RL is a natural fit for studying continual learning. We then provide a taxonomy of different continual RL formulations by mathematically characterizing two key properties of non-stationarity, namely, the scope and driver non-stationarity. This offers a unified view of various formulati…

Citations: cited by 93 publications (33 citation statements)
References: 255 publications (199 reference statements)
“…An important benefit of using options for exploration is that, by encoding temporally extended behaviours into a set of options, the agent can later leverage a collection of diverse and purposeful behaviours in other tasks. This is particularly important in the face of non-stationary, or continual learning (Khetarpal et al, 2020), and is in direct contrast to several other exploration techniques. Methods such as count-based or error prediction-based methods are more tied to the agent's state visitation distribution and are not that flexible in the face of non-stationarity.…”
Section: Exploration In the Face Of Non-stationarity (citation type: mentioning)
confidence: 99%
“…Deep reinforcement learning approaches have enabled human-like performance on many game tasks and new control policies for complex, high-dimensional spaces [2]. Recently, scaling transformer models has resulted in the creation of large language models and powerful, multi-task foundation models [3]. Figure 1.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The inspiration for many replay-based methods comes from Complementary Learning Systems (CLS) ; Khetarpal et al (2022), which describes learning in mammalian brains. The hippocampus memorises recent observations and replays them to the neocortex, which is a slow statistical learner.…”
Section: Introduction (citation type: mentioning)
confidence: 99%