2021
DOI: 10.48550/arxiv.2104.10219
Preprint

Scalable Synthesis of Verified Controllers in Deep Reinforcement Learning

Abstract: There has been significant recent interest in devising verification techniques for learning-enabled controllers (LECs) that manage safety-critical systems. Given the opacity and lack of interpretability of the neural policies that govern the behavior of such controllers, many existing approaches enforce safety properties through the use of shields, a dynamic monitoring and repair mechanism that ensures a LEC does not emit actions that would violate desired safety conditions. These methods, however, have shown …
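The shield mechanism described in the abstract — dynamically monitoring a learned policy's actions and repairing unsafe ones — can be sketched as a thin wrapper around the policy. The following is a minimal illustration only; the safety predicate, dynamics, and fallback action are hypothetical placeholders, not the construction from the paper:

```python
# Minimal sketch of a runtime shield for a learned controller.
# The safety condition and fallback policy are illustrative assumptions.

def is_safe(state, action):
    # Hypothetical safety condition: the action must keep the
    # 1-D successor state within the interval [-10, 10].
    return -10.0 <= state + action <= 10.0

def fallback(state):
    # Hypothetical verified fallback: steer back toward the origin.
    return -1.0 if state > 0 else 1.0

def shielded_step(state, proposed_action):
    """Monitor the learned policy's proposed action; repair it if unsafe."""
    if is_safe(state, proposed_action):
        return proposed_action   # safe: pass the action through unchanged
    return fallback(state)       # unsafe: override with the fallback action

# Usage: an action that would leave the safe region is repaired.
print(shielded_step(9.5, 2.0))   # unsafe proposal, repaired to -1.0
print(shielded_step(0.0, 2.0))   # safe proposal, passed through as 2.0
```

The shield never modifies the policy itself; it only intercepts actions at execution time, which is what makes it applicable to opaque neural policies.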

Cited by 4 publications (5 citation statements)
References 29 publications
“…Most research in safe DRL focuses on enhancing safety and robustness by reducing potentially unsafe actions, including methods for safe monitoring and adversarial training. For example, shielding methods [11], [12], [16] prevent the agent from taking unsafe actions in every state. Mandlekar et al. [17] used actively chosen adversarial perturbations for robust policy training to improve robustness (resistance to changes) in complex environments.…”
Section: Related Work
confidence: 99%
“…One is based on model transformation, which transforms the embedded DNN model into an interpretable model such as decision trees or programs [3,32]. Another is to synthesize barrier functions that assist the DNN in decision making, which can ensure safety during deployment [35,33]. The last is to incorporate the DNN into the system dynamics [14,31].…”
Section: Efficiency and Scalability
confidence: 99%
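The barrier-function approach mentioned in the statement above can be illustrated as a filter on actions: a barrier certificate is non-negative exactly on the safe set, and an action is accepted only if the successor state keeps the certificate non-negative. This is a simplified sketch under assumed 1-D dynamics, not the synthesis procedure from [35,33]:

```python
# Sketch of barrier-function-guided action filtering.
# The dynamics and barrier certificate are illustrative assumptions.

def dynamics(x, u):
    # Hypothetical 1-D linear system: x' = 0.9*x + u
    return 0.9 * x + u

def barrier(x):
    # Hypothetical barrier certificate: B(x) >= 0 on the safe set |x| <= 5.
    return 25.0 - x * x

def certified(x, u):
    """Accept an action only if the barrier stays non-negative afterward."""
    return barrier(dynamics(x, u)) >= 0.0

print(certified(4.0, 0.5))   # successor 4.1 is inside |x| <= 5 -> True
print(certified(4.0, 3.0))   # successor 6.6 is outside      -> False
```

In the synthesized setting, the barrier function is produced offline with a proof that this check implies safety for all reachable states, so the runtime cost is just one function evaluation per step.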
“…Instead of directly verifying DRL systems, most existing approaches rely on transforming them into verifiable models. Representative works include extracting decision trees [3] and programmatic policies [32], synthesizing deterministic programs [35] and linear controllers [33], and transforming into hybrid systems [14] and star sets [31]. Although these transformation-based approaches are effective solutions, they have some limitations, e.g., extracted policies may not equivalently represent the source neural networks, and the properties that can be verified may be limited.…”
Section: Introduction
confidence: 99%
“…A typical example is autonomous driving, which is arguably still a long way off due to safety concerns [21,39]. Recently, tremendous efforts have been made toward adapting existing and devising new formal methods for DRL systems in order to provide provable safety guarantees [18,25,45,46,51].…”
Section: Introduction
confidence: 99%