Solving Simple Stochastic Games with Few Coin Toss Positions

Ibsen-Jensen, Rasmus; Miltersen, Peter Bro

doi:10.1007/978-3-642-33090-2_55

Cited by 15 publications

(20 citation statements)

References 16 publications

“…gave a randomized algorithm with expected running time $|V_R|!\,|V|^{O(1)}$. Ibsen-Jensen and Miltersen [IJM12] improved these bounds by showing that a variant of value iteration solves SSGs in time in $O(|V_R|\,2^{|V_R|}(|V_R| \log |V_R| + |V|))$. For BW-games several pseudo-polynomial and subexponential algorithms are known [GKK88, KL93, ZP96, Pis99, BV01a, BV01b, HBV04, BV05, BV07, Hal07, Vor08]; see also [JPZ06] for parity games.…”

Section: Introduction, 1 Basic Concepts (mentioning)

confidence: 99%

A Pseudo-Polynomial Algorithm for Mean Payoff Stochastic Games with Perfect Information and a Few Random Positions

Boros

Elbassioni

Gurvich

et al. 2013

Automata, Languages, and Programming

We consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph $G = (V, E)$, with local rewards $r : E \to \mathbb{Z}$, and three types of positions: black $V_B$, white $V_W$, and random $V_R$ forming a partition of $V$. It is a longstanding open question whether a polynomial time algorithm for BWR-games exists, or not, even when $|V_R| = 0$. In fact, a pseudo-polynomial algorithm for BWR-games would already imply their polynomial solvability. In this paper, we show that BWR-games with a constant number of random positions can be solved in pseudo-polynomial time. More precisely, in any BWR-game with $|V_R| = O(1)$, a saddle point in uniformly optimal pure stationary strategies can be found in time polynomial in $|V_W| + |V_B|$, the maximum absolute local reward, and the common denominator of the transition probabilities.
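The citation statement above refers to a variant of value iteration for simple stochastic games (SSGs). As a rough illustration only, the sketch below runs plain, textbook value iteration on a tiny reachability SSG; it is not the Ibsen-Jensen and Miltersen variant and does not achieve their running time. The position kinds (`"max"`, `"min"`, `"coin"`, `"goal"`, `"sink"`), the function name `value_iteration`, and the example `game` are all assumptions made up for this example.

```python
from fractions import Fraction

def value_iteration(positions, horizon):
    """Plain value iteration on a reachability SSG (illustrative sketch).

    positions maps a name to (kind, successors), where kind is one of
    "max", "min", "coin", "goal", "sink" (hypothetical labels).
    Returns the probability of reaching "goal" within `horizon` steps
    under optimal play, for every position.
    """
    # Horizon 0: only the goal position has value 1.
    values = {p: Fraction(1 if kind == "goal" else 0)
              for p, (kind, _) in positions.items()}
    for _ in range(horizon):
        new_values = {}
        for p, (kind, succ) in positions.items():
            if kind == "goal":
                new_values[p] = Fraction(1)
            elif kind == "sink":
                new_values[p] = Fraction(0)
            elif kind == "max":
                new_values[p] = max(values[s] for s in succ)
            elif kind == "min":
                new_values[p] = min(values[s] for s in succ)
            else:  # "coin": uniform average over the successors
                new_values[p] = sum(values[s] for s in succ) / len(succ)
        values = new_values  # synchronous update from the previous round
    return values

# A made-up game: "loop" is a coin toss that either reaches the goal
# or returns to itself, so its value after t steps is 1 - 2**(-t).
game = {
    "goal": ("goal", []),
    "sink": ("sink", []),
    "c": ("coin", ["goal", "sink"]),
    "a": ("max", ["c", "sink"]),
    "b": ("min", ["a", "goal"]),
    "loop": ("coin", ["goal", "loop"]),
}
```

Exact `Fraction` arithmetic is used so that the finite-horizon values can be compared without floating-point noise; in this example the value of `"loop"` approaches 1 only in the limit, which is why a bound on the required horizon matters.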

“…Proof Since there is an optimal Markov strategy, there is a counter-based strategy, which uses memory at most $\log T$. As shown by Ibsen-Jensen and Miltersen [5] for any game $G^T$, if the horizon is greater than $2 \log(\epsilon^{-1})\, 2^n$, the value of $G^T$ approximates the value of $G$ within $\epsilon$. It is clear that the value of all states are the same in an infinite-horizon game if either player is forced to play an optimal strategy.…”

Section: Definitions (mentioning)

confidence: 78%

Strategy Complexity of Finite-Horizon Markov Decision Processes and Simple Stochastic Games

Chatterjee

Ibsen-Jensen

2013

Mathematical and Engineering Methods in Computer Science

Self Cite

Markov decision processes (MDPs) and simple stochastic games (SSGs) provide a rich mathematical framework to study many important problems related to probabilistic systems. MDPs and SSGs with finite-horizon objectives, where the goal is to maximize the probability to reach a target state in a given finite time, is a classical and well-studied problem. In this work we consider the strategy complexity of finite-horizon MDPs and SSGs. We show that for all $\epsilon > 0$, the natural class of counter-based strategies require at most $\log\log(1/\epsilon) + n + 1$ memory states, and memory of size $\Omega(\log\log(1/\epsilon) + n)$ is required, for $\epsilon$-optimality, where $n$ is the number of states of the MDP (resp. SSG). Thus our bounds are asymptotically optimal. We then study the periodic property of optimal strategies, and show a sub-exponential lower bound on the period for optimal strategies.
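The upper bound in this abstract can be related to the horizon bound in the quoted citation statement. The following is only a sketch: it assumes the garbled horizon in the quote reads $T = 2 \log(\epsilon^{-1})\, 2^n$, that logarithms are base 2, and that a counter-based strategy needs $\log T$ memory, as the quote states.

```latex
\begin{align*}
T        &= 2 \log_2(\epsilon^{-1}) \cdot 2^{n}, \\
\log_2 T &= 1 + \log_2 \log_2(\epsilon^{-1}) + n .
\end{align*}
% Worked instance (illustrative numbers): n = 10, \epsilon = 2^{-16}
%   T = 2 \cdot 16 \cdot 1024 = 32768 = 2^{15},
%   \log_2 T = 1 + 4 + 10 = 15 .
```

Under these assumptions the result has the same form as the abstract's $\log\log(1/\epsilon) + n + 1$.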

“…There are various improvements with smaller dependence on k [9,15,20,23] (note that even though BWR-games are polynomially reducible to simple stochastic games, under this reduction the number of random positions does not stay constant, but is only polynomially bounded in n, even if the original BWR-game had a constant number of random positions). Recently, a pseudo-polynomial algorithm was given for BWR-games with a constant number of random positions and polynomial common denominator of transition probabilities, but under the assumption that the game is ergodic (that is, the value does not depend on the initial position) [5].…”

Section: Previous Results (mentioning)

confidence: 99%

Approximation Schemes for Stochastic Mean Payoff Games with Perfect Information and Few Random Positions

et al. 2017
