2019 American Control Conference (ACC)
DOI: 10.23919/acc.2019.8815215

Approximate Dynamic Programming with Probabilistic Temporal Logic Constraints

Abstract: In this paper, we develop approximate dynamic programming methods for stochastic systems modeled as Markov Decision Processes, given both soft performance criteria and hard constraints in a class of probabilistic temporal logic called Probabilistic Computation Tree Logic (PCTL). Our approach consists of two steps: First, we show how to transform a class of PCTL formulas into chance constraints that can be enforced during planning in stochastic systems. Second, by integrating randomized optimization and entropy…

Cited by 2 publications (8 citation statements)
References 26 publications
“…However, if we directly solve for approximately optimal policies in the product MDP using the method in Section III-B, as the reward is sparse, it becomes a rare event to sample a path satisfying the specification. As a consequence, the estimate of the gradient in [14] has a high variance with finite samples. To address this problem, we develop Topological Approximate Dynamic Programming (TADP) that leverages the structure property in the task automaton to improve the convergence due to sparse and temporally extended rewards with LTL specifications.…”
Section: Results (mentioning)
confidence: 99%
“…In this section, we present a model-free ADP method for value iteration. The method has been introduced in our previous work [14] and will be briefly reviewed here for completeness.…”
Section: Appendix (mentioning)
confidence: 99%
“…With abstraction methods such as region automaton [20], we can potentially reduce the size of the approximated finite-state model. We are also considering incorporating approximate dynamic programming [1] for large-scale MDPs to handle the issue of scalability.…”
Section: Discussion (mentioning)
confidence: 99%
“…The task specifies the assumption about the probabilistic external events and the desired agent's behavior with timing constraints related to the occurrence of the external events. Such problem is widely encountered in robotics [1], [2], and other cyber-physical systems [3].…”
Section: Introduction (mentioning)
confidence: 99%