2012
DOI: 10.1109/tac.2012.2186176

Mean Field for Markov Decision Processes: From Discrete to Continuous Optimization

Abstract: We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODE). We show that the optimal reward of such a Markov Decision Process, satisfying a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Process. We give bounds on the difference of the rewards, and a constructive algorithm for deriving an approximating sol…
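The abstract's central claim — that a system of many stochastic objects behaves, as their number grows, like a deterministic ODE (the mean field limit) — can be illustrated with a toy simulation. The sketch below is not from the paper: the two-state on/off dynamics, the rates `a` and `b`, and all function names are assumptions chosen purely for illustration. It simulates N independent objects and compares the empirical fraction of "on" objects with the Euler-integrated limiting ODE.

```python
import random

def simulate(n_objects, steps, dt, a=1.0, b=0.5, seed=0):
    """Simulate n_objects independent on/off objects.

    Per step, each 'off' object turns on with probability a*dt and
    each 'on' object turns off with probability b*dt. Returns the
    fraction of 'on' objects after each step.
    """
    rng = random.Random(seed)
    states = [0] * n_objects
    fractions = []
    for _ in range(steps):
        for i in range(n_objects):
            if states[i] == 0 and rng.random() < a * dt:
                states[i] = 1
            elif states[i] == 1 and rng.random() < b * dt:
                states[i] = 0
        fractions.append(sum(states) / n_objects)
    return fractions

def mean_field(steps, dt, a=1.0, b=0.5, x0=0.0):
    """Euler integration of the limiting ODE dx/dt = a*(1-x) - b*x,
    whose equilibrium is a/(a+b)."""
    x, xs = x0, []
    for _ in range(steps):
        x += dt * (a * (1 - x) - b * x)
        xs.append(x)
    return xs
```

For large N the gap between the empirical trajectory and the ODE shrinks (fluctuations of order 1/sqrt(N)), which is the uncontrolled analogue of the convergence the paper establishes for optimally controlled systems.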

Cited by 83 publications (87 citation statements)
References 29 publications
“…The closest to our setting seems to be the recent work (Gast, Gaujal, & Le Boudec, 2010), which is devoted to a convergence result similar to our Theorem 2. However, in (Gast, Gaujal, & Le Boudec, 2010) a continuous time mean-field control model is obtained as a limit of discrete-time models, and we work directly with continuous time models.…”
Section: Conclusion and Bibliographical Comments
confidence: 99%
“…(Andersson & Djehiche, 2003; Buckdahn, B. Djehiche, J. Li, & S. Peng, 2009) and references therein, for diffusion-based models, and (Le Boudec, McDonald, & Mundinger, 2007; Gast & B. Gaujal, 2009; Gast, Gaujal, & Le Boudec, 2010; Bordenave, McDonald, & Proutiere, 2007; Benaïm & Le Boudec, 2008; Milutinovic & Lima, 2006) for discrete models, more engineering application oriented. In these papers one can find various concrete applications (from robot swarms to transportation theory and networks), which are also relevant to the mathematical models discussed in the present paper.…”
Section: Conclusion and Bibliographical Comments
confidence: 99%
“…Therefore this paper is closer in spirit to stochastic approximation theory (Benaïm, 1998). While writing this paper I have learned from the paper by Gast et al (2012). They establish a limit result for a finite-horizon Markov decision process converging to a deterministic optimal control problem.…”
Section: Introduction
confidence: 92%
“…First I study infinite horizon problems with discounting. Second, my proof techniques are based on dynamic programming and viscosity solution techniques, whereas Gast et al (2012) rely on ideas from stochastic approximation theory. Before developing the general analysis of the problem, let me introduce some concrete examples to which the limit results apply.…”
Section: Introduction
confidence: 99%