2003
DOI: 10.1287/opre.51.6.850.24925

The Linear Programming Approach to Approximate Dynamic Programming

Abstract: The curse of dimensionality gives rise to prohibitive computational requirements that render infeasible the exact solution of large-scale stochastic control problems. We study an efficient method based on linear programming for approximating solutions to such problems. The approach "fits" a linear combination of pre-selected basis functions to the dynamic programming cost-to-go function. We develop error bounds that offer performance guarantees and also guide the selection of both basis functions and "state-relevance weights" …
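As a reference point, the approximate linear program described in the abstract can be written in the usual cost-to-go form. The notation below (basis matrix Φ, weight vector r, state-relevance weights c, discount factor γ) follows common usage for this method; it is a sketch, not the paper's exact statement.

```latex
% Sketch of the approximate LP (ALP) for a discounted-cost MDP, assuming the
% standard notation: \Phi = [\phi_1 \cdots \phi_K] stacks the pre-selected
% basis functions, r is the weight vector, c collects the state-relevance
% weights, and \gamma is the discount factor.
\begin{align*}
  \max_{r \in \mathbb{R}^K} \quad & c^{\top} \Phi r \\
  \text{s.t.} \quad & (\Phi r)(x) \;\le\; g(x,a) + \gamma \sum_{y} P_a(x,y)\,(\Phi r)(y)
      \qquad \forall\, x \in \mathcal{S},\; a \in \mathcal{A}_x .
\end{align*}
% Compactly, \Phi r \le T\Phi r with T the Bellman operator: the exact LP is
% recovered when \Phi is the identity, so the ALP keeps the Bellman constraints
% but searches only over the span of the basis functions.
```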

Cited by 550 publications (542 citation statements)
References 25 publications
“…Furthermore, there is, in general, no guarantee as to the quality of the greedy policy generated from the approximation Hw. However, the recent work of de Farias and Van Roy (2001a) provides some analysis of the error relative to that of the best possible approximation in the subspace, and some guidance as to selecting α so as to improve the quality of the approximation. In particular, their analysis shows that this LP provides the best approximation Hw* of the optimal value function V* in a weighted L1 sense subject to the constraint that Hw* ≥ T*Hw*, where the weights in the L1 norm are the state relevance weights α.…”
Section: Approximate Linear Programming (mentioning)
confidence: 99%
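To make the role of the state-relevance weights concrete, here is a minimal, self-contained sketch (not code from the paper) that solves the approximate LP for a small randomly generated discounted-cost MDP with scipy.optimize.linprog. The toy model and all names (n_states, Phi, and so on) are illustrative assumptions.

```python
# A minimal sketch of the approximate LP (ALP) on a toy discounted-cost MDP.
# Not the authors' code; the model, basis, and weights are illustrative.
import numpy as np
from scipy.optimize import linprog

n_states, n_actions, gamma = 5, 2, 0.9
rng = np.random.default_rng(0)

# Random transition matrices P[a] (rows sum to 1) and per-action cost vectors g[a].
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)
g = rng.random((n_actions, n_states))

# Pre-selected basis functions: a constant feature and a normalized state index.
Phi = np.column_stack([np.ones(n_states), np.arange(n_states) / (n_states - 1)])

# State-relevance weights: a probability distribution over states.
c = np.full(n_states, 1.0 / n_states)

# ALP: maximize c^T Phi r  subject to  Phi r <= g_a + gamma * P_a Phi r  for all a,
# i.e. minimize -(Phi^T c)^T r  with  (I - gamma * P_a) Phi r <= g_a, stacked over a.
A_ub = np.vstack([(np.eye(n_states) - gamma * P[a]) @ Phi for a in range(n_actions)])
b_ub = np.concatenate([g[a] for a in range(n_actions)])
res = linprog(-(Phi.T @ c), A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * Phi.shape[1])

r = res.x
J_tilde = Phi @ r  # approximate cost-to-go, one value per state
print("basis weights r:", r)
print("approximate cost-to-go:", J_tilde)
```

Because every feasible Φr lies below the true cost-to-go J*, maximizing c^TΦr over the feasible set is equivalent to minimizing the c-weighted L1 distance ||J* − Φr||_{1,c}, which is the characterization quoted in the excerpt above (stated there in the reward-maximization convention, hence the reversed inequality Hw ≥ T*Hw).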
“…In many applications of DP, the number of states and actions available in each state are large; consequently, the computational effort required to compute the optimal policy for a DP can be overwhelming: Bellman's "curse of dimensionality". For this reason, considerable recent research effort has focused on developing algorithms that compute an approximately optimal policy efficiently (Bertsekas and Tsitsiklis, 1996; de Farias and Van Roy, 2002).…”
Section: Introduction (mentioning)
confidence: 99%
“…Due to the "curse of dimensionality," Markov decision processes typically have a prohibitively large number of states, rendering exact dynamic programming methods intractable and calling for the development of approximation techniques. This paper represents a step in the development of a linear programming approach to approximate dynamic programming (de Farias and Van Roy 2003; Schweitzer and Seidmann 1985; Zin 1993, 1997). This approach relies on solving a linear program that generally has few variables but an intractable number of constraints.…”
(mentioning)
confidence: 99%
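The last excerpt highlights the practical obstacle: the ALP has one constraint per state-action pair. A common remedy in this line of work is to solve a reduced LP built from a sampled subset of constraints. The sketch below is my illustration of that idea on the same kind of toy model as above, with a uniform sampling distribution chosen purely for simplicity; it is not the procedure from the paper.

```python
# Sketch of constraint sampling for the ALP: keep only a random subset of the
# (state, action) Bellman constraints so the reduced LP stays small even when
# |S| x |A| is intractable. The uniform sampling distribution and all names
# here are illustrative assumptions.
import numpy as np
from scipy.optimize import linprog

def sampled_alp(P, g, Phi, c, gamma, n_samples, rng):
    """Solve a reduced ALP built from n_samples randomly chosen constraints.

    P   : (n_actions, n_states, n_states) transition matrices
    g   : (n_actions, n_states) per-action cost vectors
    Phi : (n_states, K) basis matrix
    c   : (n_states,) state-relevance weights
    """
    n_actions, n_states, _ = P.shape
    idx = rng.integers(0, n_states * n_actions, size=n_samples)
    states, actions = idx % n_states, idx // n_states
    # One sampled constraint per pair: (e_x - gamma * P_a[x, :]) Phi r <= g_a(x).
    A_ub = np.vstack([(np.eye(n_states)[x] - gamma * P[a, x]) @ Phi
                      for x, a in zip(states, actions)])
    b_ub = np.array([g[a, x] for x, a in zip(states, actions)])
    obj = -(Phi.T @ c)  # maximize c^T Phi r
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * Phi.shape[1])
    # With too few sampled constraints the reduced LP can be unbounded;
    # a nonzero status signals that more samples are needed.
    return res.x if res.status == 0 else None
```

With P, g, Phi, and c from the previous sketch, sampled_alp(P, g, Phi, c, gamma=0.9, n_samples=8, rng=np.random.default_rng(1)) builds and solves an 8-constraint LP. In the toy model this saves nothing, but when |S| x |A| is astronomically large the sampled LP is the only one that can be written down at all, which is the point of the constraint-reduction line of work the excerpt refers to.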