2022
DOI: 10.1007/978-3-031-19992-9_18
Optimistic and Topological Value Iteration for Simple Stochastic Games

Cited by 9 publications (11 citation statements)
References 26 publications
“…However, when we extracted the induced MDPs, we found them all easy for VI. Similarly, [3] used a random generation of SGs of at most 10,000 states, many of which were challenging for the SG algorithms. Yet the same random generation modified to produce MDPs delivered only MDPs easily solved in seconds, even with drastically increased numbers of states.…”
Section: Discussion
confidence: 99%
“…This uniformity may be misleading. Indeed, for some stochastic game algorithms, using LP to solve the underlying MDPs may be preferential [3,Appendix E.4]. An application in runtime assurance preferred PI for numerical stability [45,Sect.…”
Section: Introduction
confidence: 99%
“…Kleene's fixpoint theorem suggests a simple method for approximating the lfp μφ from below: simply iterate φ starting at 0, i.e., compute the sequence l_0 = 0, l_1 = φ(l_0), l_2 = φ(l_1), etc. In the context of MDP, this iterative scheme is known as Value Iteration (VI). VI is easy to implement, but it is difficult to decide when to stop the iteration.…”
Section: The Optimistic Value Iteration Algorithm
confidence: 99%
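
To make the scheme in the preceding quote concrete, the following is a minimal Python sketch of plain value iteration for maximal reachability probabilities. The dictionary-based MDP encoding is a hypothetical illustration chosen here, not the representation used in the cited papers. It iterates the Bellman operator φ from the zero vector (l_0 = 0, l_{k+1} = φ(l_k)) and, as the quote notes, has no sound stopping criterion of its own, so it simply fixes the iteration count.

# Sketch of Kleene/value iteration for max-reachability in an MDP.
# Encoding (assumed for illustration): transitions maps each state to a list
# of actions, each action a list of (successor, probability) pairs.
def value_iteration(transitions, targets, num_iterations=1000):
    """Return a lower bound on maximal reachability probabilities."""
    states = set(transitions) | set(targets)
    values = {s: (1.0 if s in targets else 0.0) for s in states}  # l_0
    for _ in range(num_iterations):
        new_values = dict(values)
        for s, actions in transitions.items():
            if s in targets or not actions:
                continue
            # phi: best (maximising) action's expected value under the old vector
            new_values[s] = max(
                sum(p * values[t] for t, p in action) for action in actions
            )
        values = new_values  # l_{k+1} = phi(l_k)
    return values

# Tiny usage example on a hypothetical two-state MDP:
# value_iteration({"s0": [[("goal", 0.5), ("s0", 0.5)]], "goal": []}, {"goal"})
# converges towards {"s0": 1.0, "goal": 1.0}.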
“…In a nutshell, the idea of OVI is to compute some lower bound l on the solution – which can be done using an approximative iterative algorithm – and then optimistically guess an upper bound u = l + ε and verify that the guess was correct. Prior to our paper, OVI had only been considered in Markov Decision Processes (MDP) [22] and Stochastic Games (SG) [1], where it is used to compute bounds on, e.g., maximal reachability probabilities. The upper bounds computed by OVI have a special property: They are self-certifying (also called inductive in our paper): Given the system and the bounds, one can check very easily that the bounds are indeed correct.…”
Section: Introduction
confidence: 99%
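
The quoted description of OVI can be illustrated with a short Python sketch: iterate to a lower bound l, optimistically guess u = l + ε, and verify the guess with an inductive check φ(u) ≤ u (which, for a monotone φ, certifies u ≥ lfp φ). The operator phi, the tolerances, and the failure handling are simplified assumptions; the full algorithms in the cited papers refine the guess and retry rather than giving up.

# Sketch of the optimistic value iteration (OVI) idea under the assumptions above.
def optimistic_value_iteration(phi, num_states, eps=1e-6, tol=1e-8, max_iters=10**6):
    """phi maps a list of floats to a list of floats and is assumed monotone."""
    lower = [0.0] * num_states
    for _ in range(max_iters):
        new_lower = phi(lower)
        done = max(abs(a - b) for a, b in zip(new_lower, lower)) < tol
        lower = new_lower
        if done:
            break  # lower bound has (numerically) stabilised
    # Optimistic guess: add eps to every component (probabilities are capped at 1).
    upper = [min(v + eps, 1.0) for v in lower]
    if all(c <= u for c, u in zip(phi(upper), upper)):
        # phi(u) <= u certifies u >= lfp(phi), so [lower, upper] brackets the solution.
        return lower, upper
    return None  # guess not inductive; a complete OVI would tighten tol and retry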
“…This uniformity may be misleading. Indeed, for stochastic games and a particular technique, using LP to solve the underlying MDPs may be preferential [3,Appendix E.4]. For examples in runtime assurance, numerical instability meant that PI was preferred [32,Sect.…”
Section: Introduction
confidence: 99%