2005
DOI: 10.1287/opre.1050.0216

Robust Control of Markov Decision Processes with Uncertain Transition Matrices

Abstract: Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems. We consider a robust control problem for a finite-state, finite-action Markov decision process, where uncertainty on the transition matrices is described in terms of possibly nonconvex sets. We show t…
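The robust control problem the abstract describes can be illustrated with a small worst-case value iteration. The sketch below is not the paper's algorithm: it assumes a discounted cost-minimisation objective and a finite list of candidate next-state distributions for each state-action pair, both of which are illustrative simplifications. The function name `robust_value_iteration`, the toy `cost` table, and the toy `candidates` sets are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical type: for each (state, action), a finite list of candidate
# next-state distributions standing in for the uncertainty set.
Candidates = Dict[Tuple[int, int], List[List[float]]]


def robust_value_iteration(
    n_states: int,
    n_actions: int,
    cost: List[List[float]],
    candidates: Candidates,
    gamma: float = 0.9,
    tol: float = 1e-8,
    max_iter: int = 10_000,
) -> Tuple[List[float], List[int]]:
    """Worst-case value iteration over a finite uncertainty set (a sketch)."""
    v = [0.0] * n_states
    policy = [0] * n_states
    for _ in range(max_iter):
        new_v = []
        for s in range(n_states):
            best_value, best_action = float("inf"), 0
            for a in range(n_actions):
                # Nature picks the candidate distribution that maximises
                # the expected cost-to-go from (s, a).
                worst = max(
                    sum(p[j] * v[j] for j in range(n_states))
                    for p in candidates[(s, a)]
                )
                q = cost[s][a] + gamma * worst
                # The controller picks the action with the smallest worst case.
                if q < best_value:
                    best_value, best_action = q, a
            new_v.append(best_value)
            policy[s] = best_action
        diff = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if diff < tol:
            break
    return v, policy


# Toy data (made up for illustration): state 0 is costly, state 1 is free.
cost = [[1.0, 1.5], [0.0, 0.0]]
candidates: Candidates = {
    (0, 0): [[0.5, 0.5], [0.8, 0.2]],
    (0, 1): [[0.3, 0.7], [0.4, 0.6]],
    (1, 0): [[0.1, 0.9], [0.3, 0.7]],
    (1, 1): [[0.1, 0.9], [0.3, 0.7]],
}

if __name__ == "__main__":
    values, policy = robust_value_iteration(2, 2, cost, candidates)
    print(values, policy)
```

Restricting each candidate list to a single distribution recovers ordinary value iteration, so running both variants on the same data gives a rough sense of how much the worst-case costs exceed the nominal ones.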

Cited by 628 publications (737 citation statements)
References 20 publications
“…In Section 4 we describe three families of sets of conditional measures that are based on the confidence regions, and show that the computational effort required to solve the robust DP corresponding to these sets is only modestly higher than that required to solve the nonrobust counterpart. The results in this section, although independently obtained, are not new and were first obtained by Nilim and El Ghaoui (2002). In Section 5 we provide basic examples and computational results.…”
Section: Introduction (citation type: mentioning)
confidence: 87%
“…In current practice, these errors are ignored and the optimal policy is computed assuming that the estimate is, indeed, the true transition probability. The DP optimal policy is quite sensitive to perturbations in the transition probability and ignoring the estimation errors can lead to serious degradation in performance (Nilim and El Ghaoui, 2002; Tsitsiklis et al., 2002). Degradation in performance due to estimation errors in parameters has also been observed in other contexts (Ben-Tal and Nemirovski, 1997; Goldfarb and Iyengar, 2003).…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The robust optimal policy for an uncertain MDP can be computed using various methods [10], [16], [15].…”
Section: A Two-step Solution For UMDP (citation type: mentioning)
confidence: 99%
“…However, none of these methods is robust in the presence of modeling uncertainty. On the other hand, for MDPs with uncertain parameters, robust MDPs have been extensively studied [10], [8], [15]. Recently, robust control of MDPs has been extended to handle expressive temporal logic constraints [16], [3], [14].…”
Section: Introduction (citation type: mentioning)
confidence: 99%