2013
DOI: 10.1016/j.orl.2013.02.002

Strong polynomiality of policy iterations for average-cost MDPs modeling replacement and maintenance problems

Abstract: This note considers an average-cost Markov Decision Process (MDP) with finite state and action sets that satisfies the additional condition that there is a state to which the system jumps from any state and under any action with positive probability. The main result is that the policy iteration algorithm is strongly polynomial for such MDPs, which are often used to model replacement and maintenance problems.
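To make the setting concrete, below is a minimal sketch of Howard's policy iteration for a finite average-cost MDP, assuming the condition stated in the abstract: some state ell (state 0 by default here) is reached from every state under every action with positive probability, so every stationary policy is unichain and the evaluation system is well posed. The function name, array layout, and tie-breaking tolerance are illustrative choices, not taken from the paper.

import numpy as np

def average_cost_policy_iteration(P, c, ell=0, max_iter=1000):
    """Howard's policy iteration for a finite average-cost MDP (sketch).

    Assumes the abstract's condition: state `ell` is reached from every
    state under every action with positive probability, so every
    stationary policy is unichain.

    P : array of shape (m, n, n), P[a, x, y] = Prob(next = y | state = x, action = a)
    c : array of shape (n, m),    c[x, a]    = one-step cost
    """
    n, m = c.shape
    policy = np.zeros(n, dtype=int)                # arbitrary initial policy
    g, h = 0.0, np.zeros(n)
    for _ in range(max_iter):
        # Policy evaluation: solve g + h(x) = c(x, pi(x)) + sum_y P_pi(x, y) h(y),
        # with the normalization h(ell) = 0.
        P_pi = P[policy, np.arange(n), :]          # (n, n) transition matrix of pi
        c_pi = c[np.arange(n), policy]             # (n,)  cost vector of pi
        A = np.eye(n) - P_pi
        A[:, ell] = 1.0                            # column ell now carries the gain g
        sol = np.linalg.solve(A, c_pi)
        g, h = sol[ell], sol.copy()
        h[ell] = 0.0
        # Policy improvement: greedy in the bias h; keep the current action on ties.
        Q = c + np.einsum('axy,y->xa', P, h)       # Q[x, a] = c(x, a) + E[h(next state)]
        keep = Q[np.arange(n), policy] <= Q.min(axis=1) + 1e-10
        new_policy = np.where(keep, policy, Q.argmin(axis=1))
        if np.array_equal(new_policy, policy):
            break                                  # no strict improvement: pi is optimal
        policy = new_policy
    return g, h, policy

The paper's result concerns the number of iterations of exactly this kind of loop: under the jump-to-ell assumption, it is bounded by a polynomial in the numbers of states and actions alone, which is what "strongly polynomial" refers to.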

Cited by 9 publications (6 citation statements)
References 10 publications
“…for all x ∈X, there exists a stationary optimal policy for the discounted problem, and a stationary policy is optimal for this problem if and only if (8) holds for all x ∈ X. Corollary 7. Suppose Assumption HT holds with a state ℓ ∈ X that is isolated from X and Assumption WC holds.…”
Section: Hv-ag Transformation
confidence: 99%
“…One advantage of this approach is that it can be used to apply methods and algorithms developed for discounted MDPs to undiscounted MDPs [1,8,9].…”
Section: Introduction
confidence: 99%
“…Typically, average optimization is a more difficult problem than discounted optimization [7]. PI can need an exponential number of iterations under average optimality for stochastic MDPs in the general case [6].…”
Section: Dynamic Programming
confidence: 99%
“…A direct reduction of average-cost MDPs to discounted ones, which yields sufficient conditions for the existence of stationary average-cost optimal policies, was established by Ross for MDPs with Borel state space, finite action sets, bounded costs, and a state to which the process will transition from any state under any action with probability at least α > 0. This reduction and Ye's results were used by Feinberg and Huang to obtain iteration bounds for average-cost policy iterations. Gubenko and Štatland showed that a reduction is also possible for MDPs with Borel state space, bounded costs, and compact action sets, if a “minorization” condition, which generalizes Ross's assumption, is satisfied; see also Dynkin and Yushkevich, Chapter 7, §10.…”
Section: Introduction
confidence: 99%
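As a reading aid for the quoted passage, here is a one-line sketch of the reduction it describes (Ross's construction); the notation $p$, $q$, $c$, $\alpha$, $\ell$ is ours, and the display is only a summary, not the citing paper's formulation.

If $p(\ell \mid x,a) \ge \alpha > 0$ for all state-action pairs $(x,a)$, set
\[
q(y \mid x,a) = \frac{p(y \mid x,a) - \alpha\,\mathbf{1}\{y=\ell\}}{1-\alpha},
\qquad \beta = 1-\alpha .
\]
If $v_\beta$ solves the discounted optimality equation for the modified kernel,
\[
v_\beta(x) = \min_{a}\Big[c(x,a) + \beta \sum_{y} q(y \mid x,a)\, v_\beta(y)\Big],
\]
then $g = \alpha\, v_\beta(\ell)$ and $h = v_\beta$ satisfy the average-cost optimality equation
\[
g + h(x) = \min_{a}\Big[c(x,a) + \sum_{y} p(y \mid x,a)\, h(y)\Big],
\]
so optimal stationary policies for the discounted problem are average-cost optimal for the original kernel.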