2023
DOI: 10.1016/j.automatica.2022.110684
Safe exploration in model-based reinforcement learning using control barrier functions

Cited by 22 publications (8 citation statements). References 68 publications.
“…Alternatively, in some works the MATLAB/Simulink platform is also used for training or evaluating RL agents [111]. One crucial observation is that a large number of works have used customized or adapted environments for training and evaluation rather than conventional environments [74,24,84].…”
Section: Review of Simulation/Evaluation Benchmarks (mentioning)
confidence: 99%
“…In recent years, significant progress has been made in developing safe model-based reinforcement learning (SMBRL) techniques to learn safe controllers for different classes of systems [6-16]. While Markov decision process (MDP) based SMBRL methods are available for discrete-time systems with finite state and action spaces [6-9], synthesizing online controllers for continuous-time systems, under output feedback, while guaranteeing stability and safety is still a challenging problem.…”
Section: Introduction (mentioning)
confidence: 99%
“…For nonlinear systems, Reference 28 develops a safe exploration scheme for jointly learning the dynamics of an uncertain control system and the optimal value function/policy. The proposed approach uses Lyapunov-like barrier functions [29] to build a robust safeguarding controller that can guarantee safety when combined with an arbitrary learning-based control policy.…”
Section: Introduction (mentioning)
confidence: 99%
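The safeguarding idea quoted above, wrapping an arbitrary learning-based policy with a barrier-function-based controller, can be sketched as a control barrier function (CBF) safety filter. The sketch below is a minimal illustration, not the construction from the cited papers: the single-integrator dynamics x_dot = u, the barrier h(x) = 1 - x^2, and the gain alpha are all assumptions chosen for simplicity.

```python
def cbf_safety_filter(x, u_nom, alpha=1.0):
    """Safety filter for the scalar single integrator x_dot = u.

    Hypothetical safe set for illustration: h(x) = 1 - x**2 >= 0.
    The CBF condition  dh/dx * u + alpha * h(x) >= 0  is enforced by
    minimally modifying the nominal (e.g. learned) action u_nom: if
    u_nom already satisfies the constraint it is returned unchanged,
    otherwise it is projected onto the constraint half-space.
    """
    dh = -2.0 * x                 # dh/dx for h(x) = 1 - x**2
    h = 1.0 - x * x
    b = -alpha * h                # constraint reads: dh * u >= b
    if dh * u_nom >= b:
        return u_nom              # nominal action is already safe
    # Closed-form minimum-norm correction (1-D projection onto dh*u >= b)
    return u_nom + (b - dh * u_nom) * dh / (dh * dh)
```

In higher dimensions the same projection becomes a small quadratic program solved at every time step; the key design point, matching the quoted description, is that the filter wraps any learning-based policy without modifying the policy itself.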