2018
DOI: 10.1007/978-3-030-03421-4_21

Monte Carlo Tree Search for Verifying Reachability in Markov Decision Processes

Abstract: The maximum reachability probabilities in a Markov decision process can be computed using value iteration (VI). Recently, simulation-based heuristic extensions of VI have been introduced, such as bounded real-time dynamic programming (BRTDP), which often manage to avoid explicit analysis of the whole state space while preserving guarantees on the computed result. In this paper, we introduce a new class of such heuristics, based on Monte Carlo tree search (MCTS), a technique celebrated in various machine-learni…
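As a rough illustration of the value iteration baseline the abstract refers to, the sketch below computes maximum reachability probabilities on a small, made-up MDP. It is not taken from the paper: the dictionary encoding, the state and action names, and the function `max_reachability` are all assumptions chosen for the example. The simple change-based stopping rule used here gives no guarantee on the error, which is the kind of gap that bounded methods such as BRTDP are meant to close.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption for this sketch):
# state -> action -> list of (probability, successor state).
MDP = Dict[str, Dict[str, List[Tuple[float, str]]]]


def max_reachability(mdp: MDP, targets: set,
                     max_iters: int = 1000, tol: float = 1e-10) -> Dict[str, float]:
    """Approximate the maximum probability of reaching `targets` from each state."""
    # Start from the lower bound: 1 on target states, 0 everywhere else.
    values = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in mdp.items():
            # Targets are fixed at 1; states without actions are absorbing.
            if state in targets or not actions:
                continue
            # Bellman update: take the action with the best expected value.
            best = max(sum(p * values[t] for p, t in succ)
                       for succ in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        # Heuristic stopping rule only; it does not bound the true error.
        if delta < tol:
            break
    return values


# Made-up four-state example.
toy_mdp: MDP = {
    "s0": {"a": [(0.5, "goal"), (0.5, "sink")],
           "b": [(0.9, "s1"), (0.1, "sink")]},
    "s1": {"c": [(0.8, "goal"), (0.2, "sink")]},
    "goal": {},
    "sink": {},
}

result = max_reachability(toy_mdp, {"goal"})
print(result)
```

In this toy model, action `b` in `s0` is the better choice because 0.9 × 0.8 exceeds the 0.5 offered by action `a`.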

Cited by 6 publications (1 citation statement)
References 23 publications

“…We are not currently aware of a tool implementation supporting MA in this setting, though. Ashok et al [2] recently showed that MCTS can be combined with the ideas of BRTDP in various ways to obtain different "hybrid" algorithms that provide sound results and perform better than BRTDP alone. MCTS does not apply directly to timed MA settings like checking time-bounded reachability properties due to the need to consider nonmemoryless schedulers.…”
Section: Other Partial-exploration Approaches In Verification
Type: mentioning
confidence: 99%