The 2020 Conference on Artificial Life
DOI: 10.1162/isal_a_00318
Safe Reinforcement Learning through Meta-learned Instincts

Abstract: An important goal in reinforcement learning is to create agents that can quickly adapt to new goals while avoiding situations that might cause damage to themselves or their environments. One way agents learn is through exploration mechanisms, which are needed to discover new policies. However, in deep reinforcement learning, exploration is normally done by injecting noise in the action space. While performing well in many domains, this setup has the inherent risk that the noisy actions performed by the agent […]

Cited by 6 publications (4 citation statements) · References 24 publications
“…The instinctual network is aware of the action a^P_i as well as the state observation s_i at step i, creating the instinct state observation s^I_i := (s_i, a^P_i). This is in contrast to our previous MLIN approach (Grbic and Risi, 2020), in which the instinct co-evolved to expect what kind of behavior the policy performs around hazards and therefore did not need a^P_i as input. In our IR^2L approach, the instinct needs to work with a random policy on a task where hazards could be distributed differently than during pretraining; the instinct needs to know what the policy wants to execute so it can modulate it accordingly.…”
Section: Approach: Instinct Regulated Reinforcement Learning
confidence: 60%
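For intuition, here is a minimal sketch of the control step that quote describes. It is not the authors' exact formulation: the callable names, the assumption that the instinct returns a fallback action plus a suppression gate in [0, 1], and the linear blending rule are all illustrative.

```python
import numpy as np

def instinct_modulated_step(policy, instinct, s_i):
    """One IR^2L-style step (illustrative interface, not the published one).

    The instinct observes both the state s_i and the policy's proposed
    action a^P_i, i.e. s^I_i := (s_i, a^P_i), and modulates the action.
    """
    a_p = policy(s_i)                        # a^P_i proposed by the policy
    s_instinct = np.concatenate([s_i, a_p])  # s^I_i := (s_i, a^P_i)
    a_i, gate = instinct(s_instinct)         # fallback action, gate in [0, 1]
    return (1.0 - gate) * a_p + gate * a_i   # gate -> 1 overrides the policy

# Toy usage with stand-in networks:
policy = lambda s: np.tanh(s[:2])
instinct = lambda s_i: (np.zeros(2), 0.9)    # near a hazard: mostly override
print(instinct_modulated_step(policy, instinct, np.ones(4)))
```

The key point the quote makes is visible in the second line of the function: because the instinct here must regulate an arbitrary, possibly random policy, it is given a^P_i as part of its observation rather than having to predict it.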
“…In this paper, we are building on the Meta-Learned Instinctual Network (MLIN) approach (Grbic and Risi, 2020), where a policy neural network is split into two major components: a main network trained for a specific task, and a fixed pre-trained instinctual network that transfers between tasks and overrides the main policy if the agent is about to execute a dangerous action. However, meta-learning can be quite expensive since it relies on two nested learning loops: an inner task-specific loop and an outer meta-learning loop.…”
Section: Introduction
confidence: 99%
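As a rough illustration of that two-component split — layer sizes, the sigmoid gate, and the blending are assumptions, not the published MLIN architecture — a minimal PyTorch sketch:

```python
import torch
import torch.nn as nn

class MLINPolicy(nn.Module):
    """Sketch of the MLIN split: a task-trained main network plus a
    pre-trained instinct network that stays frozen during task adaptation.
    Sizes and the gating scheme are illustrative assumptions."""

    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.main = nn.Sequential(        # trained for the current task
            nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, act_dim))
        self.instinct = nn.Sequential(    # transfers between tasks
            nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, act_dim + 1))
        for p in self.instinct.parameters():
            p.requires_grad = False       # instinct is fixed after pretraining

    def forward(self, obs):
        a_main = self.main(obs)
        out = self.instinct(obs)          # note: the instinct sees only the state
        a_safe, gate = out[..., :-1], torch.sigmoid(out[..., -1:])
        # gate -> 1 near hazards: the instinct overrides the main policy
        return (1 - gate) * a_main + gate * a_safe
```

The cost the quote refers to comes from how such an instinct is obtained: an inner loop adapts `self.main` to each task while an outer meta-learning loop updates the instinct across tasks, which is what the IR^2L follow-up aims to avoid.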
“…Work in transfer learning has leveraged meta-RL [14] for safe adaptation [18,32,30]. Our work is also related to curriculum learning [5,51,33].…”
Section: Related Work
confidence: 99%
“…By contrast, MESA explicitly reasons about safety constraints in the environment to learn adaptable risk measures. Additionally, while prior work has also explored using meta-learning in the context of safe-RL [24], specifically by learning a single safety filter which keeps policies adapted for different tasks safe, we instead adapt the risk measure itself to unseen dynamics and fault structures.…”
Section: A.4.2 Meta Reinforcement Learning
confidence: 99%