2020 ACM/IEEE 11th International Conference on Cyber-Physical Systems (ICCPS)
DOI: 10.1109/iccps48487.2020.00017
Formal Controller Synthesis for Continuous-Space MDPs via Model-Free Reinforcement Learning

Cited by 38 publications (36 citation statements) · References 27 publications
“…Reinforcement learning [186] (RL) is a sampling-based optimization algorithm that computes optimal policies driven by scalar reward signals. Recently, RL has been extended to work with formal logic [32,33,138,68,98] and automata-based structures (ω-automata [64,65] and reward machines [77]) instead of scalar reward signals. A promising future direction is to extend RL-based synthesis to reason about security properties of the system.…”
Section: Future Directions
confidence: 99%
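To make the automaton-based idea concrete, here is a minimal, hypothetical sketch (not the construction of any of the cited papers): tabular Q-learning on the product of an environment state with a reward-machine state, so that the non-Markovian task "visit A, then visit B" becomes Markovian. The grid world, reward machine, and hyperparameters are all illustrative assumptions.

```python
# Hypothetical sketch: Q-learning over the product (env state, RM state).
# Only the accepting reward-machine transition emits reward, mirroring
# how automata replace hand-crafted scalar rewards.
import random

GRID = 5            # assumed 1-D grid world with states 0..4
A, B = 2, 4         # subgoals: first reach A, then reach B

def rm_step(q, s):
    """Reward-machine transition: q0 --at A--> q1 --at B--> q2 (accepting)."""
    if q == 0 and s == A:
        return 1, 0.0
    if q == 1 and s == B:
        return 2, 1.0   # reward only on the accepting transition
    return q, 0.0

Q = {}
alpha, gamma, eps = 0.1, 0.95, 0.2
acts = [-1, 1]
for _ in range(2000):
    s, q = 0, 0
    for _ in range(100):
        a = random.choice(acts) if random.random() < eps else \
            max(acts, key=lambda b: Q.get((s, q, b), 0.0))
        s2 = min(max(s + a, 0), GRID - 1)   # deterministic dynamics
        q2, r = rm_step(q, s2)
        best = max(Q.get((s2, q2, b), 0.0) for b in acts)
        old = Q.get((s, q, a), 0.0)
        Q[(s, q, a)] = old + alpha * (r + gamma * best - old)
        s, q = s2, q2
        if q == 2:      # accepting state reached; episode done
            break
```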
“…In such systems, security verification needs to reason about neural networks along with the system dynamics. There is a large body of work [74,1,64,152,115,209,98] on verifying control systems with neural networks using SMT solvers, which provides a promising avenue of research for developing security verification and synthesis approaches for CPS with neural-network-based controllers.…”
Section: Future Directions
confidence: 99%
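As a flavor of what such SMT encodings look like, below is a toy sketch (an illustrative assumption, not taken from any of the cited tools) that checks an output bound on a two-neuron ReLU network over an input box by asking Z3 whether the property's negation is satisfiable. The weights and the bound are made up.

```python
# Toy sketch: verify y <= 2 for a tiny ReLU network on x in [0, 1]
# by checking that the negation (y > 2) is unsatisfiable.
from z3 import Real, If, Solver, And, unsat

x = Real('x')
h1 = If(2 * x + 1 >= 0, 2 * x + 1, 0)   # ReLU(2x + 1)
h2 = If(-x + 3 >= 0, -x + 3, 0)         # ReLU(-x + 3)
y = h1 - h2                              # linear output layer

s = Solver()
s.add(And(x >= 0, x <= 1))               # input region
s.add(y > 2)                             # negated property
if s.check() == unsat:
    print("property y <= 2 holds on [0, 1]")
else:
    print("counterexample:", s.model())
```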
“…Techniques from formal methods can be effectively used to address the problem of principled history-aware reward engineering. For instance, automata-based approaches [31,19,17,25] have been used to handle non-Markovian, omega-regular objectives over infinite behaviors. However, these methods are prone to producing sparse reward functions.…”
Section: Introduction
confidence: 99%
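One standard way to soften the sparsity just mentioned is potential-based reward shaping (Ng et al., 1999) applied over automaton states; pairing the two here is an illustrative assumption of this sketch, not a claim about the cited works. The potential values below are arbitrary.

```python
# Illustrative sketch: the automaton emits reward only on its accepting
# transition (sparse); potential-based shaping adds a dense progress
# signal without changing the optimal policy.
GAMMA = 0.95
PHI = {0: 0.0, 1: 0.5, 2: 1.0}   # assumed "progress" potential per RM state

def shaped_reward(q, q_next, base_reward):
    """Return base_reward plus the shaping term gamma*Phi(q') - Phi(q)."""
    return base_reward + GAMMA * PHI[q_next] - PHI[q]

# e.g., progressing q0 -> q1 now yields 0.475 instead of 0.0
print(shaped_reward(0, 1, 0.0))
```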
“…We, however, focus on (infinite-state) stochastic hybrid systems in continuous time. In the discrete-time setting, Lavaei et al. [31] recently proposed a way to learn on an abstraction of the continuous state space that provides guarantees on the error of the final results. Other than that, learning for these systems, in particular in continuous time, has not been well explored from a formal perspective w.r.t.…”
Section: Introduction
confidence: 99%
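To illustrate the abstraction idea at its simplest, the sketch below (a hypothetical reduction that omits the error-bound machinery, which is the cited work's actual contribution) maps a continuous scalar state onto a uniform grid so that a tabular learner can operate on the finite abstraction. The dynamics, grid resolution, and reward are assumptions.

```python
# Hypothetical sketch: uniform grid abstraction of a continuous state
# space, with tabular Q-learning running on the finite abstract states.
import random

LO, HI, CELLS = -2.0, 2.0, 20

def abstract(x):
    """Map a continuous state in [LO, HI] to its grid-cell index."""
    i = int((x - LO) / (HI - LO) * CELLS)
    return min(max(i, 0), CELLS - 1)

def step(x, a):
    """Assumed noisy scalar dynamics under action a in {-1, +1}."""
    return min(max(x + 0.1 * a + random.gauss(0.0, 0.05), LO), HI)

Q = [[0.0, 0.0] for _ in range(CELLS)]   # two actions per abstract state
alpha, gamma, eps = 0.1, 0.9, 0.1
for _ in range(5000):
    x = random.uniform(LO, HI)
    for _ in range(30):
        s = abstract(x)
        a = random.randrange(2) if random.random() < eps else \
            (0 if Q[s][0] >= Q[s][1] else 1)
        x2 = step(x, (-1, 1)[a])
        r = 1.0 if abs(x2) < 0.2 else 0.0   # assumed goal: stay near 0
        s2 = abstract(x2)
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        x = x2
```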