2012
DOI: 10.1137/100805169

Linear Programming and Constrained Average Optimality for General Continuous-Time Markov Decision Processes in History-Dependent Policies

Cited by 31 publications
(44 citation statements)
References 26 publications
“…To precisely define the optimality criterion, we need to introduce the concept of a policy, which is a generalization of the policies (on Borel measurability) in [9], [12], [18], [20], and [21] to the universal measurability.…”
Section: The Optimal Control Problems (mentioning)
confidence: 99%
“…the survey [11], the monographs [8], [24], the recent works [9], [12], [20], [21], and [25], and the extensive references therein. As is well known, the commonly used optimality criteria in CTMDPs are the expected discounted, average, and the finite-horizon.…”
Section: Introduction (mentioning)
confidence: 99%
“…Therefore, we can refer to Lemma 9 for an optimal solution $v^{\mathrm{opt}}$ of problem (16) with $f$ given by (17) such that $v^{\mathrm{opt}} = \sum_{l=0}^{N} b_l v_l^{\mathrm{ex}}$, where, for each $l = 0, 1,$ …”
Section: {S A(x) Q(dy… (mentioning)
confidence: 99%
“…; see the examples in the monographs [13] and [25]. Two standard performance measures of a CTMDP are the (expected) long-run average costs [12], [14], [17], [24], [34], [39] and the (expected) total discounted costs [15], [27], [30]-[32]. The long-run average criteria are not appropriate for CTMDPs with transient behavior because in that case the long-run average costs will be zero for each policy.…”
Section: Introduction (mentioning)
confidence: 99%