The Expected Total Cost Criterion for Markov Decision Processes under Constraints: A Convex Analytic Approach

Dufour, François; Horiguchi, Masayuki; Piunovskiy, Alexey

doi:10.1017/s0001867800005875

Cited by 15 publications

(67 citation statements)

References 9 publications


“…is called to be of finite value. It can be shown as in [9,Thm.3 Appendix 4] or [12]. It can be shown as in the proof of [9, Thm.3.3, Cor.3.1] that the stationary strategy π µ defined by…”

Section: MDP Approach and the First Linear Programming Methods (mentioning)

confidence: 96%

“…According to [9, Thm.4.1], for a constrained total cost MDP with Borel state space X_∆, Borel action space B, transition probability Q, and positive cost functions {C_j}_{j=0}^J, if the model is semicontinuous, then, provided that there exists a feasible strategy with finite value, there is an optimal stationary strategy. Here the model is called semicontinuous if its action space B is compact, {C_j}_{j=0}^J are all lower semicontinuous, and Q is continuous, i.e., for each bounded continuous function f on X_∆,…”

Section: MDP Approach and the First Linear Programming Methods (mentioning)

confidence: 99%

“…Throughout this paper, we fix this measurable mapping f^*. It can be shown as in [9, Prop.3.2] that Q(V^c | x, f^*(x)) = 0 for each x ∈ V.…”

Section: MDP Approach and the First Linear Programming Methods (mentioning)

confidence: 99%

“…The first possible definition comes from the occupation measures of the MDP corresponding to the impulse control problem. Based on results in the recent work [9] for MDPs, we establish a linear programming approach for the impulse control problem, and obtain, under very general and natural conditions, the existence of an optimal (possibly randomized) stationary policy. This linear programming approach will be referred to as the first because it is based on the first definition of occupation measures.…”

Section: Introduction (mentioning)

confidence: 99%
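The occupation measures mentioned in the statement above can be illustrated on a toy model. The following is a minimal sketch, assuming a hypothetical finite absorbing MDP that does not come from the cited papers: the single state "s", the actions "stay" and "stop", the kernel `trans`, and the costs `cost0` and `cost1` are all invented for illustration. It computes the occupation measure of a stationary randomized strategy and shows that every expected total cost is a linear function of that measure, which is the property a linear programming formulation relies on.

```python
# Hypothetical toy model (an assumption for illustration, not from the cited papers).
# trans[(x, a)] is a sub-stochastic kernel on the non-absorbed states;
# the missing probability mass is absorbed (the process stops, no further cost).

def occupation_measure(policy, trans, init, horizon=10_000, tol=1e-12):
    """Return mu[(x, a)]: expected number of times action a is used in state x."""
    mu = {}
    dist = dict(init)  # mass on non-absorbed states at the current step
    for _ in range(horizon):
        if sum(dist.values()) < tol:  # almost everything has been absorbed
            break
        nxt = {}
        for x, px in dist.items():
            for a, pa in policy[x].items():
                w = px * pa  # probability of using (x, a) at this step
                mu[(x, a)] = mu.get((x, a), 0.0) + w
                for y, q in trans[(x, a)].items():
                    nxt[y] = nxt.get(y, 0.0) + w * q
        dist = nxt
    return mu


def total_cost(mu, cost):
    """Expected total cost is linear in mu: the sum of mu(x, a) * cost(x, a)."""
    return sum(m * cost[xa] for xa, m in mu.items())


init = {"s": 1.0}
trans = {("s", "stay"): {"s": 0.5},  # survive with probability 0.5
         ("s", "stop"): {}}          # absorbed immediately
cost0 = {("s", "stay"): 1.0, ("s", "stop"): 3.0}  # objective cost
cost1 = {("s", "stay"): 1.0, ("s", "stop"): 0.0}  # constraint cost

always_stay = {"s": {"stay": 1.0}}
mixed = {"s": {"stay": 0.5, "stop": 0.5}}  # randomized stationary strategy

for name, pi in [("always_stay", always_stay), ("mixed", mixed)]:
    mu = occupation_measure(pi, trans, init)
    print(name, total_cost(mu, cost0), total_cost(mu, cost1))
```

Because both the objective and the constraint are linear in `mu`, a constrained problem over this toy model could be posed as a linear program in the variables `mu[(x, a)]`. The sketch only evaluates fixed strategies and does not solve that program.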

“…This linear programming approach will be referred to as the first because it is based on the first definition of occupation measures. Here the problematic issue is that while the MDP model in [9] is required to be semicontinuous, the induced MDP from our impulse control problem does not have a continuous transition kernel. To get over this difficulty, we use the following trick: the state space can be extended in such a way that one can introduce a suitable topology on it, with respect to which, the resulting MDP model becomes semicontinuous.…”

Section: Introduction (mentioning)

confidence: 99%


Optimal Impulse Control of Dynamical Systems

Piunovskiy, Plakhov, Torres, et al. 2019. SIAM J. Control Optim.

This paper considers a constrained optimal impulse control problem of dynamical systems generated by a flow. Under quite general and natural conditions, we prove the existence of an optimal stationary policy. This is done by making use of the tools of Markov decision processes. Two linear programming approaches are established and justified. In absence of constraints, we show that these two linear programming approaches are dual to the dynamic programming method with the optimality equations in the integral and differential form, respectively.


Impulsive Control for Continuous-Time Markov Decision Processes: A Linear Programming Approach

Dufour, Piunovskiy. 2015. Appl Math Optim. Self cite.

On the Continuity of the Projection Mapping from Strategic Measures to Occupation Measures in Absorbing Markov Decision Processes

Piunovskiy, Zhang. 2024. Appl Math Optim.

In this paper, we prove the following assertion for an absorbing Markov decision process (MDP) with the given initial distribution, which is also assumed to be semi-continuous: the continuity of the projection mapping from the space of strategic measures to the space of occupation measures, both endowed with their weak topologies, is equivalent to the MDP model being uniformly absorbing. An example demonstrates, among other interesting scenarios, that for an absorbing (but not uniformly absorbing) semi-continuous MDP with the given initial distribution, the space of occupation measures can fail to be compact in the weak topology.
