The 2021 Conference on Artificial Life (ALIFE 2021)
DOI: 10.1162/isal_a_00449
Safer Reinforcement Learning through Transferable Instinct Networks

Abstract: Random exploration is one of the main mechanisms through which reinforcement learning (RL) finds well-performing policies. However, it can lead to undesirable or catastrophic outcomes when learning online in safety-critical environments. In fact, safe learning is one of the major obstacles towards real-world agents that can learn during deployment. One way of ensuring that agents respect hard limitations is to explicitly configure boundaries in which they can operate. While this might work in some cases, we do…
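The abstract mentions explicitly configured boundaries as one way to enforce hard limits on an agent. A minimal sketch of that idea is an action-clamping wrapper placed between the policy and the environment; the names here (`SafeActionWrapper`, the toy `env_step`) are illustrative assumptions, not part of the paper's method.

```python
# Hypothetical sketch: a hard safety boundary around an RL agent's actions.
# The class and names are illustrative, not taken from the cited paper.

class SafeActionWrapper:
    """Clamp proposed actions to a configured safe interval before execution."""

    def __init__(self, env_step, low, high):
        self.env_step = env_step  # callable: action -> (observation, reward)
        self.low = low
        self.high = high

    def step(self, action):
        # Clip the proposed action into the safe operating range,
        # so the environment never receives an out-of-bounds command.
        safe_action = max(self.low, min(self.high, action))
        return self.env_step(safe_action)


# Usage with a toy environment whose reward echoes the executed action.
wrapper = SafeActionWrapper(lambda a: (None, a), low=-1.0, high=1.0)
_, reward = wrapper.step(5.0)  # proposed action exceeds the upper bound
print(reward)  # 1.0 (clipped to the boundary)
```

Such static clipping is exactly the kind of fixed boundary the abstract argues works only "in some cases"; the paper's instinct-network approach instead learns and transfers the safety behavior.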

Cited by 4 publications (1 citation statement)
References 27 publications (35 reference statements)
“…Importantly, STELLAR integrated 11 innovative components that solve different challenges and requirements for LL. It employed Sliced Cramer Preservation (SCP) (Kolouri et al, 2020), or the sketched version of it (SCP++) (Li et al, 2021), and Complex Synapse Optimizer (Benna and Fusi, 2016) to overcome catastrophic forgetting of old tasks; Self-Preserving World Model (Ketz et al, 2019) and Context-Skill Model (Tutum et al, 2021) for backward transfer to old tasks as well as forward transfer to their variants; Neuromodulated Attention (Zou et al, 2020) for rapid performance recovery when an old task repeats; Modulated Hebbian Network (Ladosz et al, 2022) and Plastic Neuromodulated Network (Ben-Iwhiwhu et al, 2021) for rapid adaptation to new tasks; Reflexive Adaptation (Maguire et al, 2021) and Meta-Learned Instinct Network (Grbic and Risi, 2021) to safely adapt to new tasks; and Probabilistic Program Neurogenesis (Martin and Pilly, 2019) to scale up the learning of new tasks during fielded operation. More details on the precise effect of each of these components are beyond the scope of this paper; however, this case study outlines how the integrated system dynamics demonstrated LL using the proposed metrics, and how these metrics shaped the advancement of the SG-HRL system.…”
Section: System Group HRL - CARLA, 5.3.1 System Overview (mentioning)
Confidence: 99%