2000
DOI: 10.1145/347476.347480

Complexity of finite-horizon Markov decision process problems

Abstract: Controlled stochastic systems occur in science, engineering, manufacturing, the social sciences, and many other contexts. If the system is modeled as a Markov decision process (MDP) and will run ad infinitum, the optimal control policy can be computed in polynomial time using linear programming. The problems considered here assume that the time that the process will run is finite, and based on the size of the input. There are many factors that compound the complexity of computing the optimal…
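
The abstract's polynomial-time claim for the infinite-horizon case can be illustrated on a toy instance. The sketch below is not from the paper: it assumes a made-up two-state, two-action discounted MDP and solves the standard primal linear program (minimize the sum of state values subject to v(s) ≥ r(s,a) + γ Σ_{s'} P(s'|s,a) v(s')) with SciPy; the criterion (discounted reward) and all numbers are assumptions for illustration only.

```python
# Minimal sketch (not from the paper): solve a toy discounted MDP via the
# standard LP  min sum_s v(s)  s.t.  v(s) >= R[s,a] + gamma * sum_s' P[s,a,s'] v(s').
import numpy as np
from scipy.optimize import linprog

n_states, n_actions, gamma = 2, 2, 0.9
# P[s, a, s'] = transition probability, R[s, a] = expected immediate reward (invented values)
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

# Each (s, a) pair gives one constraint, rewritten in <= form for linprog:
# gamma * P[s,a,:] @ v - v(s) <= -R[s,a]
A_ub, b_ub = [], []
for s in range(n_states):
    for a in range(n_actions):
        row = gamma * P[s, a, :]
        row[s] -= 1.0
        A_ub.append(row)
        b_ub.append(-R[s, a])

res = linprog(c=np.ones(n_states), A_ub=np.array(A_ub), b_ub=b_ub,
              bounds=[(None, None)] * n_states, method="highs")
v_star = res.x
# A greedy policy with respect to the optimal values is an optimal stationary policy.
policy = np.argmax(R + gamma * (P @ v_star), axis=1)
print("optimal values:", v_star, "optimal policy:", policy)
```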

Cited by 114 publications (106 citation statements)
References 44 publications

“…The crucial assumption is here the presence of nodes with simultaneous actions. Without such nodes, the solving of such games is well known (see [11,13,12] for more on this).…”
Section: Games With Simultaneous Actions (GSA) (mentioning)
confidence: 99%
“…exponential time, exponential space, doubly-exponential time) for the fully observable, no observation, and partially observable cases respectively, for the criterion of deciding whether a 100% winning strategy exists. With exponential horizon, the complexities decrease to EXP, NEXP, EXPSPACE respectively [11]. With two players without random part, the problem of approximating the best winning probability that can be achieved regardless of the opponent strategy is undecidable [14] by reduction to the one-player randomized case above in the no observation case; the best complexity upper bounds for bounded horizon are 3EXP (for exponential horizon) and 2EXP (for polynomial horizon).…”
Section: Introduction (mentioning)
confidence: 99%
“…Unfortunately, obtaining the true H-horizon optimal value is often difficult, e.g., due to the large state space (see, e.g., [29] for a discussion of the complexity of solving finite-horizon MDPs). Motivated by this, we study an approximate receding horizon control that uses an approximate value function as an approximate solution of V^*_{H-1} for some H < ∞.…”
Section: Receding Horizon Control (mentioning)
confidence: 99%
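
The receding horizon scheme quoted above can be made concrete with a short sketch. The code below is not from [29] or from the citing paper; it is an illustration under assumed names and a made-up toy MDP: perform H-step backward induction seeded with an approximate value function standing in for V^*_{H-1}, then execute only the first greedy action and repeat from the newly observed state.

```python
# Illustrative sketch (hypothetical, not the cited authors' code) of receding
# horizon control with an approximate terminal value function.
import numpy as np

def h_step_lookahead(P, R, gamma, H, v_approx):
    """H-step backward induction, seeded with an approximate tail value function."""
    v = v_approx.copy()                 # stands in for the approximate V^*_{H-1}
    q = R + gamma * (P @ v)
    for _ in range(H - 1):
        v = q.max(axis=1)               # optimal value with one more step to go
        q = R + gamma * (P @ v)         # q[s, a] = R[s,a] + gamma * sum_s' P[s,a,s'] v[s']
    return q                            # Q-values for the current step (H steps to go)

def receding_horizon_action(P, R, gamma, H, v_approx, state):
    """Act greedily with respect to the H-step lookahead; re-solve at every step."""
    q = h_step_lookahead(P, R, gamma, H, v_approx)
    return int(np.argmax(q[state]))

# Toy 2-state, 2-action MDP, invented purely for illustration.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
v0 = np.zeros(2)                        # crude approximate value function
print(receding_horizon_action(P, R, gamma=0.95, H=5, v_approx=v0, state=0))
```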
“…In terms of computational complexity, optimally solving a finite-horizon Dec-POMDP is NEXP-complete (Bernstein et al., 2002). In contrast, finite-horizon POMDPs are PSPACE-complete (Mundhenk, Goldsmith, Lusena, & Allender, 2000), a strictly lower complexity class that highlights the difficulty of solving Dec-POMDPs.…”
Section: Introduction (mentioning)
confidence: 99%