The paper develops efficient and general stochastic approximation (SA) methods for improving the operation of parametrized systems of either the continuous- or discrete-event dynamical system type that are of interest over a long time period. For example, one might wish to optimize or improve the stationary (or average cost per unit time) performance by adjusting the system's parameters. The number of applications and the associated literature are increasing at a rapid rate. This is partly due to the increasing activity in computing pathwise derivatives and adapting them to the average-cost problem. Although the original motivation and the examples come from an interest in the infinite-horizon problem, the techniques and results are of general applicability in SA. We present an updated review of powerful ordinary differential equation (ODE)-type methods, in a fairly general context, based on weak convergence ideas. The results and proof techniques are applicable to a wide variety of applications. Exploiting the full potential of these ideas can greatly simplify and extend much current work. Their breadth, as well as the relative ease of using the basic ideas, is illustrated in detail via typical examples drawn from discrete-event dynamical systems, piecewise deterministic dynamical systems, and a stochastic differential equations model. In these particular illustrations, we use either infinitesimal perturbation analysis-type estimators, mean square derivative-type estimators, or finite-difference-type estimators. Markov and non-Markov models are discussed. Algorithms for distributed/asynchronous updating, as well as fully synchronous schemes, are developed.

Key words. stochastic approximation, ordinary differential equation method, weak convergence, recursive optimization, Monte Carlo optimization, discrete-event dynamical systems, piecewise deterministic dynamical systems, stationary cost problems

AMS subject classifications.
62L20, 93C40, 93E25, 90B25

1. Introduction. The paper is concerned with efficient and general stochastic approximation (SA) methods for parametrized systems of either the continuous or discrete-event dynamical system type that are of interest over a long time period. For example, one might wish to optimize or improve the stationary (or average cost per unit time) performance by adjusting the system's parameters. The number of applications and the associated literature are increasing at a rapid rate. Although the motivation and examples come from an interest in this infinite-horizon problem, the techniques and results are of general applicability in SA. Basic techniques for such problems have appeared in [2, 22, 27]. These techniques are still fundamental for applications to the general problems of current interest. Exploiting their full potential can greatly simplify and extend much current work. We present a full development of the basic ideas in [22, 27] and related works in a more general context, with the particular goal of illustrating their breadth as well as the relative ease ...
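The kind of recursion the ODE method analyzes can be sketched on a toy problem. Below is a minimal, assumed illustration (not from the paper): a Kiefer–Wolfowitz-style SA iteration that uses a finite-difference-type estimator of the gradient of a noisy cost, here a quadratic stand-in for a long-run average cost measured by simulation. The step sizes `a_n` and difference widths `c_n` are conventional choices, not ones prescribed by this paper.

```python
import random

def noisy_cost(theta):
    # Toy stationary cost J(theta) = (theta - 2)^2 observed with additive
    # noise; stands in for an average cost estimated from a simulation run.
    return (theta - 2.0) ** 2 + random.gauss(0.0, 0.1)

def kiefer_wolfowitz(theta0, n_iter=5000):
    # SA recursion theta_{n+1} = theta_n - a_n * Y_n, where Y_n is a
    # two-sided finite-difference estimate of the cost gradient.
    theta = theta0
    for n in range(1, n_iter + 1):
        a_n = 1.0 / n            # step size (sum diverges, squares summable)
        c_n = 1.0 / n ** 0.25    # finite-difference width, shrinking slowly
        y_n = (noisy_cost(theta + c_n) - noisy_cost(theta - c_n)) / (2 * c_n)
        theta -= a_n * y_n
    return theta
```

In the ODE method's view, the iterates of this recursion track trajectories of the mean-flow ODE (here, gradient descent on the true cost), which is what drives convergence to the minimizer at 2.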
This paper addresses the problem of sensitivity analysis for finite-horizon performance measures of general Markov chains. We derive closed-form expressions and associated unbiased gradient estimators for the derivatives of finite products of Markov kernels by measure-valued differentiation (MVD). In the MVD setting, the derivatives of Markov kernels, called D-derivatives, are defined with respect to a class of performance functions D such that, for any performance measure g ∈ D, the derivative of the integral of g with respect to the one-step transition probability of the Markov chain exists. The MVD approach (i) yields results that can be applied to performance functions drawn from a predefined class, (ii) allows for a product rule of differentiation, that is, analyzing the derivative of the transition kernel immediately yields finite-horizon results, (iii) provides an operator-language approach to the differentiation of Markov chains, and (iv) clearly identifies the trade-off between the generality of the performance classes that can be analyzed and the generality of the classes of measures (Markov kernels). The D-derivative of a measure can be interpreted in terms of various (unbiased) gradient estimators, and the product rule for D-differentiation yields a product rule for various gradient estimators.
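A toy instance can make the product-rule idea concrete. The sketch below is an assumed illustration, not the paper's general construction: the chain is a random walk whose increments are Bernoulli(θ), and the D-derivative of the Bernoulli(θ) measure decomposes as δ₁ − δ₀ (normalizing constant 1, positive part δ₁, negative part δ₀). The product rule then says the derivative of the n-step performance is a sum over steps k, where step k is replaced by its +/− split measures and all other steps are sampled normally (here with common random numbers).

```python
import random

def mvd_gradient(theta, n_steps, g, n_rep=2000):
    # Estimate d/dtheta E[g(X_n)] for X_n = B_1 + ... + B_n, B_t ~ Bernoulli(theta),
    # via measure-valued differentiation: d/dtheta Bernoulli(theta) = delta_1 - delta_0.
    total = 0.0
    for _ in range(n_rep):
        grad = 0.0
        for k in range(n_steps):          # product rule: differentiate step k
            x_plus = x_minus = 0.0
            for t in range(n_steps):
                if t == k:
                    b_plus, b_minus = 1.0, 0.0   # split step k into +/- measures
                else:
                    b = 1.0 if random.random() < theta else 0.0
                    b_plus = b_minus = b         # couple the undifferentiated steps
                x_plus += b_plus
                x_minus += b_minus
            grad += g(x_plus) - g(x_minus)       # unbiased difference estimator
        total += grad
    return total / n_rep
```

For g(x) = x the true derivative of E[X_n] = nθ is n, and the estimator recovers it with zero variance in this coupled form; for nonlinear g the estimator remains unbiased but carries sampling noise.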
This study concerns a generic model-free stochastic optimization problem requiring the minimization of a risk function defined on a given bounded domain in a Euclidean space. Smoothness assumptions regarding the risk function are hypothesized, and members of the underlying space of probabilities are presumed subject to a large deviation principle; however, the risk function may well be nonconvex and multimodal. A general approach to finding the risk minimizer on the basis of decision/observation pairs is proposed. It consists of repeatedly observing pairs over a collection of design points. Principles are derived for choosing the number of these design points on the basis of an observation budget, and for allocating the observations between these points in both prescheduled and adaptive settings. On the basis of these principles, large-deviation-type bounds on the estimated minimizer in terms of sample size are established.
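The prescheduled setting described above can be sketched in its simplest form. This is an assumed toy version, not the paper's method: the observation budget is spread evenly over a fixed grid of design points in one dimension, the risk at each point is estimated by the empirical mean of noisy observations, and the grid point with the smallest estimate is returned. The paper's contribution, how the number of points and the per-point allocation should scale with the budget, is replaced here by fixed hypothetical values.

```python
import random

def minimize_by_design_points(risk_sample, lo, hi, n_points=11, budget=2200):
    # Prescheduled allocation: budget // n_points noisy observations at each
    # of n_points evenly spaced design points; return the empirical minimizer.
    per_point = budget // n_points
    best_x, best_val = None, float("inf")
    for i in range(n_points):
        x = lo + (hi - lo) * i / (n_points - 1)
        mean = sum(risk_sample(x) for _ in range(per_point)) / per_point
        if mean < best_val:
            best_x, best_val = x, mean
    return best_x
```

Because only grid points are compared, the returned point is accurate up to the grid spacing; the large-deviation bounds in the paper quantify how the probability of picking a badly suboptimal design point decays with the sample size.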