The curse of dimensionality gives rise to prohibitive computational requirements that render infeasible the exact solution of large-scale stochastic control problems. We study an efficient method based on linear programming for approximating solutions to such problems. The approach "fits" a linear combination of pre-selected basis functions to the dynamic programming cost-to-go function. We develop error bounds that offer performance guarantees and also guide the selection of both basis functions and "state-relevance weights" that influence the quality of the approximation. Experimental results in the domain of queueing network control provide empirical support for the methodology. (Dynamic programming/optimal control: approximations/large-scale problems. Queues, algorithms: control of queueing networks.)

0030-364X/03/5106-0850; electronic ISSN 1526-5463.

Writing $\Phi r$ for the linear combination of basis functions with weight vector $r$, $c$ for the state-relevance weights, and $\mathcal{S}$ for the state space, any $r$ with $\Phi r \le J^*$ satisfies
$$\|J^* - \Phi r\|_{1,c} = \sum_{x \in \mathcal{S}} c(x)\,\bigl|J^*(x) - (\Phi r)(x)\bigr| = c^T J^* - c^T \Phi r,$$
and maximizing $c^T \Phi r$ is therefore equivalent to minimizing $\|J^* - \Phi r\|_{1,c}$.
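The link between the weighted norm and the linear objective can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the single-queue model, the two actions, the cost numbers, the basis functions 1 and x, and the uniform state-relevance weights are all hypothetical and are not taken from the paper. It computes J* by value iteration, then compares the weighted 1-norm with c^T J* − c^T Φr for one weight vector that stays below J* and one that does not.

```python
# Toy single-queue problem; every number here is an illustrative assumption.
N, ALPHA, P_ARRIVE = 10, 0.9, 0.3
SERVICE = {"slow": 0.2, "fast": 0.6}      # hypothetical departure probabilities
EXTRA_COST = {"slow": 0.0, "fast": 0.5}   # hypothetical cost of serving fast


def transitions(x, a):
    """Return {next_state: probability} for queue length x under action a."""
    up = P_ARRIVE if x < N else 0.0
    down = SERVICE[a] if x > 0 else 0.0
    moves = {x: 1.0 - up - down, x + 1: up, x - 1: down}
    return {y: p for y, p in moves.items() if p > 0.0}


def bellman(J):
    """Apply the dynamic programming operator T with cost g(x, a) = x + extra."""
    return [
        min(
            x + EXTRA_COST[a]
            + ALPHA * sum(p * J[y] for y, p in transitions(x, a).items())
            for a in SERVICE
        )
        for x in range(N + 1)
    ]


def value_iteration(tol=1e-10):
    """Iterate T from J = 0 until successive iterates agree to within tol."""
    J = [0.0] * (N + 1)
    while True:
        TJ = bellman(J)
        if max(abs(u - v) for u, v in zip(TJ, J)) < tol:
            return TJ
        J = TJ


J_star = value_iteration()
c = [1.0 / (N + 1)] * (N + 1)             # uniform state-relevance weights


def approx(r):
    """Evaluate (Phi r)(x) = r[0] * 1 + r[1] * x for every state x."""
    return [r[0] + r[1] * x for x in range(N + 1)]


def weighted_l1(r):
    """Weighted 1-norm of J* - Phi r with weights c."""
    return sum(w * abs(j - v) for w, j, v in zip(c, J_star, approx(r)))


def signed_gap(r):
    """The linear expression c^T J* - c^T Phi r."""
    return sum(w * (j - v) for w, j, v in zip(c, J_star, approx(r)))


r_below = (0.0, 1.0)    # Phi r = x, which lies below J* because g(x, a) >= x
r_above = (0.0, 50.0)   # Phi r = 50 x overshoots J* at large x

print("below J*:", weighted_l1(r_below), signed_gap(r_below))
print("crosses J*:", weighted_l1(r_above), signed_gap(r_above))
```

The two quantities agree for `r_below` and differ for `r_above`, which is why the equivalence is stated only for weight vectors whose approximation does not exceed J*.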
In the linear programming approach to approximate dynamic programming, one tries to solve a certain linear program, the ALP, that has a relatively small number K of variables but an intractable number M of constraints. In this paper, we study a scheme that samples and imposes a subset of m ≪ M constraints. A natural question that arises in this context is: How must m scale with respect to K and M in order to ensure that the resulting approximation is almost as good as one given by exact solution of the ALP? We show that, given an idealized sampling distribution and appropriate constraints on the K variables, m can be chosen independently of M and need grow only as a polynomial in K. We interpret this result in a context involving controlled queueing networks.

1. Introduction.

Due to the "curse of dimensionality," Markov decision processes typically have a prohibitively large number of states, rendering exact dynamic programming methods intractable and calling for the development of approximation techniques. This paper represents a step in the development of a linear programming approach to approximate dynamic programming (de Farias and Van Roy 2003; Schweitzer and Seidmann 1985; Trick and Zin 1993, 1997). This approach relies on solving a linear program that generally has few variables but an intractable number of constraints. In this paper, we propose and analyze a constraint sampling method for approximating the solution to this linear program.

We begin in this section by discussing our working problem formulation, the linear programming approach, constraint sampling, results of our analysis, and related literature.
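Constraint sampling can be sketched on a toy instance. The example below is a minimal illustration under loud assumptions, not the paper's experiment: the single-queue model, the cost numbers, the K = 2 basis functions (1 and x), the uniform state-relevance weights, the uniform sampling distribution (the paper's result assumes an idealized one), and the box bound on the weights are all hypothetical. With only two variables, the linear program is solved by brute-force vertex enumeration instead of a real LP solver.

```python
import random
from itertools import combinations

# Toy single-queue problem; every number here is an illustrative assumption.
N, ALPHA, P_ARRIVE, BOX = 30, 0.9, 0.3, 1000.0
SERVICE = {"slow": 0.2, "fast": 0.6}      # hypothetical departure probabilities
EXTRA_COST = {"slow": 0.0, "fast": 0.5}   # hypothetical cost of serving fast


def phi(x):
    """Two basis functions, 1 and x, so K = 2."""
    return (1.0, float(x))


def alp_constraint(x, a):
    """Encode (Phi r)(x) <= g(x, a) + ALPHA * E[(Phi r)(next)] as (coeffs, rhs)."""
    up = P_ARRIVE if x < N else 0.0
    down = SERVICE[a] if x > 0 else 0.0
    moves = {x: 1.0 - up - down, x + 1: up, x - 1: down}
    expected = [sum(p * phi(y)[k] for y, p in moves.items() if p > 0.0)
                for k in range(2)]
    coeffs = tuple(phi(x)[k] - ALPHA * expected[k] for k in range(2))
    return coeffs, x + EXTRA_COST[a]


# Bounding box |r_k| <= BOX, standing in for "appropriate constraints" on r.
BOX_CONSTRAINTS = [((1.0, 0.0), BOX), ((-1.0, 0.0), BOX),
                   ((0.0, 1.0), BOX), ((0.0, -1.0), BOX)]


def solve_lp(obj, cons, tol=1e-8):
    """Maximise obj . r subject to a . r <= b by enumerating vertices (K = 2)."""
    best, best_val = None, float("-inf")
    for (a1, b1), (a2, b2) in combinations(cons, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:
            continue  # parallel constraints do not define a vertex
        r = ((b1 * a2[1] - b2 * a1[1]) / det, (a1[0] * b2 - a2[0] * b1) / det)
        if all(a[0] * r[0] + a[1] * r[1] <= b + tol for a, b in cons):
            val = obj[0] * r[0] + obj[1] * r[1]
            if val > best_val:
                best, best_val = r, val
    return best, best_val


pairs = [(x, a) for x in range(N + 1) for a in SERVICE]   # M state-action pairs
full = [alp_constraint(x, a) for x, a in pairs]
# Objective c^T Phi with uniform weights c over the states.
obj = tuple(sum(phi(x)[k] for x in range(N + 1)) / (N + 1) for k in range(2))

rng = random.Random(0)
m = 12                                                    # sampled constraints
sampled = [alp_constraint(*rng.choice(pairs)) for _ in range(m)]

r_full, val_full = solve_lp(obj, full + BOX_CONSTRAINTS)
r_hat, val_hat = solve_lp(obj, sampled + BOX_CONSTRAINTS)

# Count how many of the M original constraints the sampled solution violates.
violated = sum(a[0] * r_hat[0] + a[1] * r_hat[1] > b + 1e-6 for a, b in full)
print("full ALP:   ", r_full, val_full)
print("sampled ALP:", r_hat, val_hat)
print("violated constraints:", violated, "of", len(full))
```

Because the sampled program is a relaxation of the full one, its objective value is never lower. The quantity of interest is how many of the dropped constraints the sampled solution violates, and how that count behaves as m grows while M stays fixed.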