53rd IEEE Conference on Decision and Control 2014
DOI: 10.1109/cdc.2014.7039601

Reachability-based safe learning with Gaussian processes

Abstract: Reinforcement learning for robotic applications faces the challenge of constraint satisfaction, which currently impedes its application to safety critical systems. Recent approaches successfully introduce safety based on reachability analysis, determining a safe region of the state space where the system can operate. However, overly constraining the freedom of the system can negatively affect performance, while attempting to learn less conservative safety constraints might fail to preserve safety if t…

Citations: cited by 223 publications (175 citation statements)
References: 14 publications
“…Since the control $u_S^*(w)$ drives the state $w$ to $f_{\Delta t}(w, u_S^*(w), d(w))$ then $\exists u(\cdot)$ that can keep the state trajectory starting at $w$ out of $K$ for all time (footnote 2). This leads to a contradiction, and $w$ must indeed be a safe state. Furthermore, if $V_S^{k+1}(w) = 0$ then (17) applies the control $u_S^*(w)$, which keeps the state trajectory in the safe set.…”
Section: B Temporal Updates For Reachability Analysis (mentioning)
confidence: 96%
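
The statement above reasons about a safe set (states from which some control keeps the trajectory out of $K$) and a safety-preserving control that is applied at its boundary. The following is a minimal sketch of that general idea on a toy grid, not the cited equation (17) or the authors' value-function update. The dynamics `step`, the wall positions, the control set, and the trivial disturbance set are all assumptions chosen so the result can be checked by hand.

```python
# Toy model: a cart on positions 0..6 with velocity in -2..2.
# Every name and number here is an illustrative assumption.
X_MAX, V_MAX = 6, 2
CONTROLS = (-1, 0, 1)   # acceleration choices
DISTURBANCES = (0,)     # kept trivial; a larger set can only shrink the safe set


def clamp(value, low, high):
    return max(low, min(high, value))


def step(state, u, d):
    """One sampled-time transition of the assumed toy dynamics."""
    x, v = state
    v_next = clamp(v + u, -V_MAX, V_MAX)
    x_next = clamp(x + v_next + d, 0, X_MAX)
    return (x_next, v_next)


def is_unsafe(state):
    """Unsafe set K: the cart touches either wall."""
    return state[0] in (0, X_MAX)


def keeps_inside(state, u, region):
    """True if control u lands in `region` for every disturbance."""
    return all(step(state, u, d) in region for d in DISTURBANCES)


def compute_safe_set(states):
    """Fixed-point iteration: repeatedly drop states with no control
    that keeps the successor inside the current candidate set."""
    safe = {s for s in states if not is_unsafe(s)}
    while True:
        kept = {s for s in safe
                if any(keeps_inside(s, u, safe) for u in CONTROLS)}
        if kept == safe:
            return safe
        safe = kept


def safe_control(state, safe):
    """Return some control that keeps the state in the safe set, if any."""
    for u in CONTROLS:
        if keeps_inside(state, u, safe):
            return u
    return None


def filter_control(state, proposed, safe):
    """Let a learner's proposed control through unless it would leave the safe set."""
    if keeps_inside(state, proposed, safe):
        return proposed
    return safe_control(state, safe)


STATES = [(x, v) for x in range(X_MAX + 1) for v in range(-V_MAX, V_MAX + 1)]
SAFE = compute_safe_set(STATES)
print(len(SAFE))
```

In this toy model the state `(5, 2)` is excluded because even full braking still reaches the wall at position 6, while `(4, 2)` stays in the safe set only because braking is available. `filter_control((4, 2), 1, SAFE)` therefore overrides the proposed acceleration with braking, whereas away from the boundary the proposed control passes through unchanged.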
“…However, more time will be needed to update mis-labeled safe states as safe. (Footnote 2: Here we are assuming that if the state trajectory enters $K$ in between samples then it stays there.)…”
Section: B Temporal Updates For Reachability Analysis (mentioning)
confidence: 99%
“…These approaches either require a safety function to avoid dangerous states while exploring [6], [7], or use simulators to confirm the safety of states [8]. In our case, the decision maker does not have any prior information about the risky states that may be found, but in contrast, every time a dead-end is reached, it will learn the causes and avoid them in the future.…”
Section: Introduction (mentioning)
confidence: 99%