Linear Programming and Constrained Average Optimality for General Continuous-Time Markov Decision Processes in History-Dependent Policies

Guo, Xianping; Huang, Yonghui; Song, Xinyuan

doi:10.1137/100805169

Cited by 31 publications

(44 citation statements)

References 26 publications

“…To precisely define the optimality criterion, we need to introduce the concept of a policy, which is a generalization of the policies (on Borel measurability) in [9], [12], [18], [20], and [21] to the universal measurability.…”

Section: The Optimal Control Problems (mentioning)

confidence: 99%

“…the survey [11], the monographs [8], [24], the recent works [9], [12], [20], [21], and [25], and the extensive references therein. As is well known, the commonly used optimality criteria in CTMDPs are the expected discounted, average, and the finite-horizon.…”

Section: Introduction (mentioning)

confidence: 99%

“…As is well known, the commonly used optimality criteria in CTMDPs are the expected discounted, average, and the finite-horizon. The former two criteria are on the infinite (time-) horizon case, and have been well studied; see, [3], [4], [7], [8], [9], [11], [16], [20], [21], [23], [24], and [26] for the infinite-horizon expected discounted criterion and [8], [10], [11], [12], [17], and [24] for the long run expected average criterion. In this paper we focus on the finite-horizon criterion for CTMDPs, thus we shall not pinpoint the earlier literature on the average and discounted CTMDPs with an infinite horizon, and give emphasis to those on finite-horizon CTMDPs.…”

Section: Introduction (mentioning)

confidence: 99%

Finite-horizon optimality for continuous-time Markov decision processes with unbounded transition rates

Guo, Huang, Huang

2015

Adv. Appl. Probab.

Self Cite

In this paper we focus on the finite-horizon optimality for denumerable continuous-time Markov decision processes, in which the transition and reward/cost rates are allowed to be unbounded, and the optimality is over the class of all randomized history-dependent policies. Under mild reasonable conditions, we first establish the existence of a solution to the finite-horizon optimality equation by designing a technique of approximations from the bounded transition rates to unbounded ones. Then we prove the existence of ε(≥ 0)-optimal Markov policies and verify that the value function is the unique solution to the optimality equation by establishing the analog of the Itô-Dynkin formula. Finally, we provide an example in which the transition rates and the value function are all unbounded and, thus, obtain solutions to some of the unsolved problems by Yushkevich (1978).

“…Therefore, we can refer to Lemma 9 for an optimal solution $v^{\mathrm{opt}}$ of problem (16) with $f$ given by (17) such that $v^{\mathrm{opt}} = \sum_{l=0}^{N} b_l v_l^{\mathrm{ex}}$, where, for each $l = 0, 1, \ldots$…”

Section: {S Aâ(x) Q(dy (mentioning)

confidence: 99%

“…; see the examples in the monographs [13] and [25]. Two standard performance measures of a CTMDP are the (expected) long-run average costs [12], [14], [17], [24], [34], [39] and the (expected) total discounted costs [15], [27], [30]-[32]. The long-run average criteria are not appropriate for CTMDPs with transient behavior because in that case the long-run average costs will be zero for each policy.…”

Section: Introductionmentioning

confidence: 99%

Absorbing Continuous-Time Markov Decision Processes with Total Cost Criteria

Guo, Vykertas, Zhang

2013

Adv. Appl. Probab.

Self Cite

In this paper we study absorbing continuous-time Markov decision processes in Polish state spaces with unbounded transition and cost rates, and history-dependent policies. The performance measure is the expected total undiscounted costs. For the unconstrained problem, we show the existence of a deterministic stationary optimal policy, whereas, for the constrained problems with N constraints, we show the existence of a mixed stationary optimal policy, where the mixture is over no more than N + 1 deterministic stationary policies. Furthermore, the strong duality result is obtained for the associated linear programs.

Risk-sensitive discounted cost criterion for continuous-time Markov decision processes on a general state space

Pal

Golui

2022

Math Meth Oper Res
