1989
DOI: 10.1287/opre.37.4.626
Average Cost Optimal Stationary Policies in Infinite State Markov Decision Processes with Unbounded Costs

Abstract: We deal with infinite state Markov decision processes with unbounded costs. Three simple conditions, based on the optimal discounted value function, guarantee the existence of an expected average cost optimal stationary policy. Sufficient conditions are the existence of a distinguished state of smallest discounted value and a single stationary policy inducing an irreducible, ergodic Markov chain for which the average cost of a first passage from any state to the distinguished state is finite. A result to verif…
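The three conditions referenced in the abstract are commonly stated in the literature in the following form. This is a sketch using standard notation (with $V_\alpha(x)$ the minimal expected total $\alpha$-discounted cost from state $x$ and $0$ a distinguished state); the paper's exact statement may differ:

```latex
% Sennott-type conditions (standard formulation; notation assumed, not quoted from the paper)
\begin{enumerate}
  \item $V_\alpha(x) < \infty$ for every state $x$ and every discount factor $\alpha \in (0,1)$.
  \item There exists $M \ge 0$ such that the relative discounted value
        $h_\alpha(x) := V_\alpha(x) - V_\alpha(0)$ satisfies $h_\alpha(x) \ge -M$
        for all $x$ and $\alpha$.
  \item There exists a nonnegative function $N(x)$ with $h_\alpha(x) \le N(x)$
        for all $x$ and $\alpha$, and for each $x$ some action $a$ with
        $\sum_{y} p(y \mid x, a)\, N(y) < \infty$.
\end{enumerate}
% Under such conditions, an expected average cost optimal stationary policy exists,
% and the optimal average cost is $g = \lim_{\alpha \uparrow 1} (1-\alpha) V_\alpha(x)$.
```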

Cited by 192 publications (177 citation statements); references 18 publications.
“…Moreover, as discussed above, the model is equivalent to a discrete time model by considering the system state at transition epochs. For the discrete model general results on average cost Markov decision problems (see, e.g., Sennott, 1989) assure the existence of a stationary average cost optimal policy. Average cost optimality of an (s, Q)-policy follows since any stationary policy based on the inventory position in the above model is equal to an (s, Q)-policy up to a transient phase.…”
Section: Proof
confidence: 99%
“…To this end, we exploit general theory of Markov decision processes that has been well developed in the past two decades. In particular, we make use of Sennott's results on infinite state Markov decision processes with unbounded costs (Sennott, 1989).…”
Section: General Demand Case: Optimal Policy Structure
confidence: 99%
“…Techniques for deriving the former from the latter are now well developed. Recent results specifically motivated by control of queues may be found in Borkar [8][9][10], Weber and Stidham [67], Cavazos-Cadena [12,13], Sennott [54,55]. For a survey, see Arapostathis et al [2].…”
Section: Introduction
confidence: 99%