Safer reinforcement learning through evolved instincts

Grbic, Djordje; Risi, Sebastian

doi:10.1145/3377929.3389946

Cited by 3 publications

(4 citation statements)

References 2 publications

“…The instinctual network is aware of the action $a^P_i$ as well as the state observation $s_i$ at step $i$, creating the instinct state observation $s^I_i := s_i, a^P_i$. This is in contrast to our previous MLIN approach (Grbic and Risi, 2020), in which the instinct co-evolved to expect what kind of behavior the policy performs around hazards and therefore did not need $a^P_i$ as input. In our IR²L approach, the instinct needs to work with a random policy on a task where hazards could be distributed differently than during pretraining; the instinct needs to know what the policy wants to execute so it can modulate it accordingly.…”

Section: Approach: Instinct Regulated Reinforcement Learning (mentioning)

confidence: 60%
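
The construction of the instinct observation described in the excerpt above can be sketched as a simple concatenation. This is a minimal illustration only, assuming the state and the proposed action are flat numeric vectors; the function name `instinct_observation` and the example values are hypothetical and do not come from the cited paper.

```python
from typing import List, Sequence


def instinct_observation(state: Sequence[float],
                         policy_action: Sequence[float]) -> List[float]:
    """Pair the state s_i with the policy's proposed action a^P_i.

    Hypothetical helper: the excerpt only says the instinct sees both
    quantities. Representing the pair as one flat vector is an assumption.
    """
    return list(state) + list(policy_action)


# Made-up example: a 3-dimensional observation and a 2-dimensional action.
s_i = [0.1, -0.4, 0.9]
a_p_i = [0.5, -1.0]
s_instinct = instinct_observation(s_i, a_p_i)
```

The point of the sketch is that the instinct's input grows by the action dimension, which is what lets it react to whatever an untrained policy proposes.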

“…In this paper we are building on the Meta-Learned Instinctual Network (MLIN) approach (Grbic and Risi, 2020), where a policy neural network is split into two major components: a main network trained for a specific task, and a fixed pre-trained instinctual network that transfers between tasks and overrides the main policy if the agent is about to execute a dangerous action. However, meta-learning can be quite expensive since it relies on two nested learning loops: an inner task-specific loop and an outer meta-learning loop.…”

Section: Introduction (mentioning)

confidence: 99%

Safer Reinforcement Learning through Transferable Instinct Networks

Grbic, Risi 2021

The 2021 Conference on Artificial Life

Self Cite

Random exploration is one of the main mechanisms through which reinforcement learning (RL) finds well-performing policies. However, it can lead to undesirable or catastrophic outcomes when learning online in safety-critical environments. In fact, safe learning is one of the major obstacles towards real-world agents that can learn during deployment. One way of ensuring that agents respect hard limitations is to explicitly configure boundaries in which they can operate. While this might work in some cases, we do not always have clear a-priori information which states and actions can lead dangerously close to hazardous states. Here, we present an approach where an additional policy can override the main policy and offer a safer alternative action. In our instinct-regulated RL (IR²L) approach, an "instinctual" network is trained to recognize undesirable situations, while guarding the learning policy against entering them. The instinct network is pre-trained on a single task where it is safe to make mistakes, and transferred to environments in which learning a new task safely is critical. We demonstrate IR²L in the OpenAI Safety gym domain, in which it receives a significantly lower number of safety violations during training than a baseline RL approach while reaching similar task performance.
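
The override idea in the abstract above can be illustrated with a small sketch. The abstract does not say how the instinct's alternative is combined with the policy's action, so the linear blend through a gate value in [0, 1] used here is an assumption, and `regulated_action`, `toy_instinct`, and the hazard-distance rule are hypothetical names and values chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def toy_instinct(state: Sequence[float],
                 policy_action: Sequence[float]) -> Tuple[List[float], float]:
    """Hypothetical stand-in for a pre-trained instinct network.

    Assumes state[0] is the distance to the nearest hazard (made up).
    Returns an alternative action and a gate: 1.0 = override, 0.0 = defer.
    """
    if state[0] < 0.5:
        return [0.0 for _ in policy_action], 1.0  # too close: stop
    return list(policy_action), 0.0               # safe: leave the policy alone


def regulated_action(state: Sequence[float],
                     policy_action: Sequence[float]) -> List[float]:
    """Blend the policy's action with the instinct's alternative (assumed rule)."""
    safe_action, gate = toy_instinct(state, policy_action)
    gate = min(1.0, max(0.0, gate))  # keep the gate inside [0, 1]
    return [(1.0 - gate) * a + gate * b
            for a, b in zip(policy_action, safe_action)]


near_hazard = regulated_action([0.2], [1.0, -1.0])  # instinct takes over
far_away = regulated_action([2.0], [1.0, -1.0])     # policy acts freely
```

A hard switch is the special case where the gate is only ever 0 or 1, as in this toy rule; a learned instinct could in principle output intermediate values.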

“…However, such approaches may require ad hoc tuning of the constraint violation reward and may result in unsafe decisions during the exploration phase. In the second category, the safety of the decisions is promoted by offline (batch) learning to initialize the exploration [16] or by the transfer of expert knowledge learned offline to guide the exploration [17]- [19]. Despite significant improvements, these approaches cannot provide safety guarantees and are not suitable for fully online learning.…”

Section: Introduction (mentioning)

confidence: 99%

Safe Reinforcement Learning for Strategic Bidding of Virtual Power Plants in Day-Ahead Markets

Stanojev, Mitridati, Prata et al. 2023

2023 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)

“…However, such approaches may require ad hoc tuning of the constraint violation reward and may result in unsafe decisions during the exploration phase. In the second category, safety of the decisions is promoted by offline (batch) learning to initialize the exploration [12], or by the transfer of expert knowledge learned offline to guide the exploration [13]- [15]. Despite significant improvements, these approaches cannot provide theoretical safety guarantees, and are not suitable for fully online learning.…”

Section: Introduction (mentioning)

confidence: 99%

A Reinforcement Learning Approach for Fast Frequency Control in Low-Inertia Power Systems

Stanojev, Ognjen, Markovic et al. 2021

2020 52nd North American Power Symposium (NAPS)

The electric grid is undergoing a major transition from fossil fuel-based power generation to renewable energy sources, typically interfaced to the grid via power electronics. The future power systems are thus expected to face increased control complexity and challenges pertaining to frequency stability due to lower levels of inertia and damping. As a result, the frequency control and development of novel ancillary services is becoming imperative. This paper proposes a data-driven control scheme, based on Reinforcement Learning (RL), for grid-forming Voltage Source Converters (VSCs), with the goal of exploiting their fast response capabilities to provide fast frequency control to the system. A centralized RL-based controller collects generator frequencies and adjusts the VSC power output, in response to a disturbance, to prevent frequency threshold violations. The proposed control scheme is analyzed and its performance evaluated through detailed time-domain simulations of the IEEE 14-bus test system.
