1998
DOI: 10.1007/s001860050035

Constrained Markov decision processes with total cost criteria: Lagrangian approach and dual linear program

Cited by 54 publications (49 citation statements)
References 53 publications
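“…Altman [90,91] studied the constrained total cost criterion for DTMDPs; [92,93] discussed the constrained average criterion for DTMDPs; [20,94] treated the constrained discounted criterion for CTMDPs; Guo et al. [21] considered the constrained average criterion for CTMDPs.…”
Section: Constrained problems
Citation type: unclassified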
“…For that reason, a number of researchers have proposed and utilized an alternative solution approach, which is based upon mathematical programming (Altman, 1998;Feinberg, 2000;Dolgov & Durfee, 2006). A procedure for formulating an MDP into a linear program (whose solution yields an optimal policy maximizing the total expected reward) is described below.…”
Section: Linear Programming
Citation type: mentioning
confidence: 99%
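
For orientation, the kind of linear program this statement refers to can be written over occupation measures. The block below is a generic textbook-style sketch for a discounted (contracting) model, not a formulation taken from the cited papers. All symbols are assumptions introduced here: x(s,a) is the occupation measure, r the reward, p the transition kernel, \alpha the initial distribution, \gamma \in (0,1) the discount factor, and c_k, d_k the cost functions and bounds of K constraints.

```latex
\begin{align*}
\max_{x \ge 0} \quad & \sum_{s,a} r(s,a)\, x(s,a) \\
\text{s.t.} \quad & \sum_{a} x(s',a) - \gamma \sum_{s,a} p(s' \mid s,a)\, x(s,a) = \alpha(s') \quad \forall s', \\
& \sum_{s,a} c_k(s,a)\, x(s,a) \le d_k, \quad k = 1,\dots,K.
\end{align*}
```

Under these assumptions, a stationary policy can be read off a feasible x by normalizing each row, \pi(a \mid s) = x(s,a) / \sum_{a'} x(s,a'), wherever the denominator is positive.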
“…Our third contribution lies in deriving key theoretical results establishing provable performance and behavior guarantees for the derived policies. Contracting or transient MDP models that use the expected total reward as the optimality criterion are commonplace in constrained MDPs since optimal stationary policies with regard to this criterion can always be found via mathematical programming in view of a well-established one-to-one correspondence between stationary policies and feasible solutions to such programs (Altman, 1998;Feinberg, 2000;Wu & Durfee, 2010;Petrik & Zilberstein, 2009). The notoriously more difficult and equally important expected average reward criterion is much less understood considering that such correspondence ceases to exist for general multichain MDPs.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
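
The correspondence between stationary policies and feasible solutions mentioned in this statement can be checked numerically on a toy model. The sketch below is illustrative only and assumes a discounted two-state, two-action MDP; the numbers and the names `GAMMA`, `ALPHA`, `P`, `PI`, `occupation_measure`, `policy_from_measure`, and `flow_residual` are all hypothetical and do not come from the cited work. It computes the occupation measure of a randomized stationary policy by a truncated discounted sum, checks that the measure satisfies the flow-balance equalities of the linear program, and recovers the policy by row normalization.

```python
# Hypothetical discounted MDP; every number here is illustrative.
GAMMA = 0.9                      # discount factor (assumed contracting model)
ALPHA = [0.5, 0.5]               # initial state distribution
# P[s][a][s2] = probability of moving from s to s2 under action a
P = [
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.3, 0.7]],
]
# PI[s][a] = probability that the stationary policy picks a in state s
PI = [[0.25, 0.75], [0.6, 0.4]]


def occupation_measure(policy, horizon=2000):
    """Approximate x(s,a) = sum_t GAMMA^t * Pr(s_t = s, a_t = a)."""
    n_s, n_a = len(P), len(P[0])
    x = [[0.0] * n_a for _ in range(n_s)]
    dist = ALPHA[:]              # state distribution at time t
    weight = 1.0                 # GAMMA^t
    for _ in range(horizon):
        for s in range(n_s):
            for a in range(n_a):
                x[s][a] += weight * dist[s] * policy[s][a]
        # Push the state distribution one step forward under the policy.
        nxt = [0.0] * n_s
        for s in range(n_s):
            for a in range(n_a):
                for s2 in range(n_s):
                    nxt[s2] += dist[s] * policy[s][a] * P[s][a][s2]
        dist = nxt
        weight *= GAMMA
    return x


def policy_from_measure(x):
    """Recover pi(a|s) by normalizing each row (rows must be positive)."""
    return [[xa / sum(row) for xa in row] for row in x]


def flow_residual(x):
    """Largest violation of sum_a x(s2,a) - GAMMA * sum p*x = ALPHA(s2)."""
    n_s, n_a = len(P), len(P[0])
    worst = 0.0
    for s2 in range(n_s):
        inflow = sum(P[s][a][s2] * x[s][a]
                     for s in range(n_s) for a in range(n_a))
        worst = max(worst, abs(sum(x[s2]) - GAMMA * inflow - ALPHA[s2]))
    return worst


if __name__ == "__main__":
    x = occupation_measure(PI)
    print("flow-balance residual:", flow_residual(x))
    print("recovered policy:", policy_from_measure(x))
```

The truncated sum is chosen here only to keep the example dependency-free; solving the linear system directly would be the more usual route in practice.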