Absorbing Continuous-Time Markov Decision Processes with Total Cost Criteria

Guo, Xianping; Vykertas, Mantas; Zhang, Yi

doi:10.1239/aap/1370870127

Cited by 8 publications (16 citation statements)

References 34 publications (68 reference statements)

“…This work is organized as follows: To focus on the development of compactification method in [17,21] for the optimal control problem from the setting of diffusion processes to that of CTMDPs, we consider in Section 2 the optimal control problem for classical CTMDPs without any random impact of the environment. The class of ψ-relaxed controls is a subset of classical admissible controls studied, for instance, in [12,13,15]. Therefore, our existence result of optimal ψ-relaxed control is a little stronger than the existence of classical admissible controls.…”

Section: Introduction (mentioning)

confidence: 89%

“…Continuous-time Markov decision processes (CTMDPs) have been extensively studied and widely applied in various application fields such as telecommunication, queueing systems, population processes, epidemiology, and so on. See, for instance, the monographs [12,26], the works [10,11,13,14,15,19,24,25] and references therein. As an illustrative example, we consider the controlled queueing systems.…”

Section: Introduction (mentioning)

confidence: 99%

The existence of optimal control for continuous-time Markov decision processes in random environments

Shao, Zhao. 2019. Preprint.

In this work, we investigate the optimal control problem for continuous-time Markov decision processes with the random impact of the environment. We provide conditions to show the existence of optimal controls under finite-horizon criteria. Under appropriate conditions, the value function is continuous and satisfies the dynamic programming principle. These results are established by introducing some restriction on the regularity of the optimal controls and by developing a new compactification method for continuous-time Markov decision processes, which is originally used to solve the optimal control problem for jump-diffusion processes.

“…On the other hand, it follows from Assumptions 3.1(ii) and 3.1(iii) that $\sup_{\mu \in D} \mu(K) = \sup_{\eta \in D} \int_K w \, d\eta < \infty$. Thus, by [11, Lemma 7], $D$ is relatively compact in $(M^+(K), \sigma(M^+(K)))$.…”

Section: Q} and The Continuity Of (mentioning)

confidence: 95%

“…(i) The occupation measures are in fact state-action frequencies. They are widely used in MDPs so as to transform a stochastic dynamic control problem to a static optimization problem; see [9], [11], [13], [15], [18], and [19]. In the next theorem we state some characterizations of the elements of D. To prove the second assertion of (ii), suppose that we have…”

Section: Occupation Measures (mentioning)

confidence: 99%

Risk-sensitive semi-Markov decision processes with general utilities and multiple criteria

Huang, Lian, Guo. 2018. Adv. Appl. Probab.

In this paper we investigate risk-sensitive semi-Markov decision processes with a Borel state space, unbounded cost rates, and general utility functions. The performance criteria are several expected utilities of the total cost in a finite horizon. Our analysis is based on a type of finite-horizon occupation measure. We express the distribution of the finite-horizon cost in terms of the occupation measure for each policy, wherein the discount is not needed. For unconstrained and constrained problems, we establish the existence and computation of optimal policies. In particular, we develop a linear program and its dual program for the constrained problem and, moreover, establish the strong duality between the two programs. Finally, we provide two special cases of our results, one of which concerns the discrete-time model, and the other the chance-constrained problem.

Discounted continuous-time Markov decision processes with unbounded rates and randomized history-dependent policies: the dynamic programming approach

Piunovskiy, Zhang. 2013. 4OR-Q J Oper Res.

This paper deals with unconstrained discounted continuous-time Markov decision processes in Borel state and action spaces. Under some conditions imposed on the primitives, allowing unbounded transition rates and unbounded (from both above and below) cost rates, we show the regularity of the controlled process, which ensures that the underlying models are well defined. Then we develop the dynamic programming approach by showing that the Bellman equation is satisfied (by the optimal value). Finally, under some compactness-continuity conditions, we obtain the existence of a deterministic stationary optimal policy out of the class of randomized history-dependent policies.

Footnote 1: It is standard practice to use "CTMDPs" and "CTMDPs optimization problems" interchangeably.

… CTMDPs allowing transition rates to be not uniformly bounded. However, the conditions assumed therein are difficult to verify, as some of them are imposed not directly on the primitives but on the transition probability functions. Later, there were developments in the direction of imposing conditions only on the primitives while still allowing unbounded transition rates; see [8,25] and the relevant chapters in the monograph [9]. It should be noted that all of the aforementioned works allowing unbounded transition rates are restricted to the class of randomized Markov policies. As a matter of fact, according to [7], the study of CTMDPs combining randomized history-dependent policies with unbounded transition rates had been an open problem for over thirty years. To the best of our knowledge, the first successful treatment of such CTMDPs is given by [10], where the state space is countable.

In the present paper, we consider a more general case by allowing randomized history-dependent policies, unbounded transition rates, and Borel state and action spaces, while all our conditions are imposed on the primitives. The cost rates, which are allowed to be unbounded (both from below and above), are also more general than those considered in [4,5,6,7,8,9,10] and many others.

The main contributions of the present paper are threefold. Under the imposed conditions on the primitives, we first show the regularity of the controlled process under any given randomized history-dependent policy, which allows a formal statement of the optimization problem. Then we develop the dynamic programming approach by showing that the optimal value of the problem satisfies the corresponding Bellman equation. Finally, we establish the existence of a deterministic stationary optimal policy. In relation to the most recent literature on this topic, the present work refines [8] by considering randomized history-dependent policies, and extends [10] to the case of Borel state spaces and more general cost rates.

The rest of this paper is organized as follows. In Section 2, we briefly describe Kitaev's construction for CTMDPs, and present some preliminary results including the regularity, Kolmogorov's forward equations and Dynkin's formula for the controlled processe...
