2002
DOI: 10.1287/opre.50.5.796.365

Structural Properties of Stochastic Dynamic Programs

Abstract: In Markov models of sequential decision processes, one is often interested in showing that the value function is monotonic, convex, and/or supermodular in the state variables. These kinds of results can be used to develop a qualitative understanding of the model and characterize how the results will change with changes in model parameters. In this paper we present several fundamental results for establishing these kinds of properties. The results are, in essence, "metatheorems" showing that the value functions…

Cited by 95 publications (79 citation statements)
References 16 publications
“…3. See also Hopenhayn and Prescott (1992), Topkis (1998), and Smith and McCardle (2002) for other conditions that yield monotonicity of optimal solutions to dynamic programs. 4.…”
Section: Proof
confidence: 99%
“…For examples of such results, see Benjaafar and ElHafsi (2006), ElHafsi et al. (2008), ElHafsi (2009), and Benjaafar et al. (2011). See also Smith and McCardle (2002) for sufficient conditions ensuring convexity in a multivariate Markovian inventory model. However, the existence of counterexamples proves that convexity need not hold for our model (see Nadar et al. 2014).…”
Section: Introduction
confidence: 99%
“…To propose the prototypical procedure of proving the existence of a monotonic optimal policy, we first define a P property as follows: We therefore propose an approach, similar to Proposition 5 in [18], as follows: $Q^{(n)}(x, a) = C(x, a) + \beta \sum_{x' \in \mathcal{X}} P^{a}_{x x'} V^{(n-1)}(x')$ has the P property for all P-property functions $V^{(n-1)}$ and all $n$.…”
Section: Structured Properties of Dynamic Programming
confidence: 99%
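The Bellman recursion quoted in this citation statement can be sketched in code. The following is a minimal value-iteration illustration; the cost matrix, transition kernels, and discount factor below are hypothetical toy data (chosen so that costs are monotone and the kernels are stochastically monotone, the standard sufficient conditions under which the value function stays monotone), not data from the cited paper:

```python
import numpy as np

# Sketch of the recursion Q^(n)(x, a) = C(x, a) + beta * sum_{x'} P[a][x, x'] V^(n-1)(x'),
# with V^(n)(x) = min_a Q^(n)(x, a), iterated to a fixed point.
def value_iteration(C, P, beta=0.9, tol=1e-8, max_iter=10_000):
    """C: (|X|, |A|) cost matrix; P: (|A|, |X|, |X|) transition kernels."""
    n_states, n_actions = C.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        Q = C + beta * np.einsum("axy,y->xa", P, V)  # expected cost-to-go per (x, a)
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

# Toy 3-state, 2-action instance: costs increase in the state and each
# kernel is stochastically monotone, so the fixed point V is nondecreasing.
C = np.array([[0.0, 0.5], [1.0, 1.2], [2.0, 1.8]])
P = np.array([
    [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.0, 0.3, 0.7]],
    [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],
])
V = value_iteration(C, P)
assert np.all(np.diff(V) >= 0)  # monotone value function
```

Checking the monotonicity of the computed fixed point, as in the assertion above, is the computational counterpart of verifying that the Bellman operator preserves the P property.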
“…In such a high dimensional MDP, the curse of dimensionality becomes more evident [17]; the computation load grows quickly if the cardinality of any tuple in the state variable is large. To relieve the curse, one solution is to qualitatively understand the model and prove the existence of a monotonic optimal policy [18]. Then, a low complexity algorithm or a model-free learning method can be proposed, e.g., simultaneous perturbation stochastic approximation (SPSA) [19,20].…”
Section: Introduction
confidence: 99%
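The SPSA method mentioned in this citation statement estimates a gradient from only two function evaluations per step, regardless of dimension, which is what makes it attractive for high-dimensional MDPs. A minimal sketch follows; the toy objective and the gain-sequence constants are illustrative assumptions, not values from the cited works [19, 20]:

```python
import numpy as np

# Minimal SPSA sketch: perturb all coordinates at once with a random
# Rademacher direction and form a two-point finite-difference gradient.
def spsa(f, theta, n_iter=2000, a=0.1, c=0.1, alpha=0.602, gamma=0.101, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float).copy()
    for k in range(1, n_iter + 1):
        ak = a / k**alpha          # decaying step size
        ck = c / k**gamma          # decaying perturbation size
        delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Rademacher direction
        g_hat = (f(theta + ck * delta) - f(theta - ck * delta)) / (2 * ck) * delta
        theta -= ak * g_hat
    return theta

# Toy black-box objective with minimum at (1, -2).
f = lambda th: (th[0] - 1.0) ** 2 + (th[1] + 2.0) ** 2
theta_star = spsa(f, [0.0, 0.0])
```

Because `delta` takes values in {-1, +1}, multiplying by `delta` is the same as dividing by it componentwise, which is the usual form of the SPSA gradient estimate.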