We propose the policy graph as a structured way of formulating a general class of multistage stochastic programming problems that leads to a natural decomposition. We also propose an extension to the stochastic dual dynamic programming algorithm to solve a subset of problems formulated as a policy graph. This subset includes discrete-time, convex, infinite-horizon, multistage stochastic programming problems with continuous state and control variables. To demonstrate the utility of our algorithm, we solve an existing multistage stochastic programming problem from the literature based on pastoral dairy farming. We show that the finite-horizon model in the literature suffers from end-of-horizon effects, which we are able to overcome with an infinite-horizon model.
KEYWORDS: infinite horizon, multistage, policy graph, stochastic programming

Networks. 2020;76:3-23. wileyonlinelibrary.com/journal/net © 2020 Wiley Periodicals, Inc.

to solve a subset of problems formulated as a policy graph. Finally, in Section 5 we demonstrate the utility of considering the infinite-horizon formulation of a problem arising from pastoral agriculture.
TERMINOLOGY
Basic definitions

First, let us define the stage in multistage.
Definition. A stage is a discrete moment in time in which the agent chooses a decision and any uncertainty is revealed.

Therefore, multistage refers to a problem that can be decomposed into a sequence of stages. This requires the assumption that time can be discretized. Second, the stochastic component of multistage stochastic programming refers to problems with uncertainty. In this paper, we differentiate between two types of uncertainty, the first of which we term a noise. (We shall describe the second type in the next section; it relates to how the problem transitions between stages.)
Definition. A noise is a stagewise-independent random variable in stage t.

In stage t, we denote a single observation of the noise by the lowercase ω_t, and the sample space from which it is drawn by the uppercase Ω_t. Ω_t can be continuous or discrete, although in this paper we only consider the discrete case. Furthermore, we use the term stagewise-independent to refer to the fact that the distribution of the noise in stage t is independent of the noise in the other time periods 1, 2, …, t − 1, t + 1, t + 2, ….

Next, we define a state, modifying slightly the definition from Powell [40].
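To make stagewise independence concrete, the following is a minimal sketch (the three-stage sample spaces and their probabilities are hypothetical, not taken from the paper) of drawing a noise observation ω_t from a discrete sample space Ω_t, where the draw in each stage is independent of the draws in every other stage:

```python
import random

# Hypothetical discrete sample spaces Omega_t for a three-stage problem,
# given as (observation, probability) pairs. Stagewise independence means
# the draw in stage t does not depend on the draw in any other stage.
OMEGA = {
    1: [(0.0, 0.5), (10.0, 0.5)],
    2: [(0.0, 0.2), (5.0, 0.6), (10.0, 0.2)],
    3: [(2.0, 1.0)],
}

def sample_noise(t):
    """Return one observation omega_t drawn from the discrete set Omega_t."""
    observations, probabilities = zip(*OMEGA[t])
    return random.choices(observations, weights=probabilities, k=1)[0]

# Each stage is sampled independently of every other stage.
scenario = [sample_noise(t) for t in (1, 2, 3)]
```

Because Ω_3 here contains a single outcome, the stage-3 noise is deterministic; the other stages are genuinely random.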
Definition. A state is a function of history that captures all the information we need to model a system from some point in time onward.

Expressed a different way, a state is the smallest piece of information that must be passed between stage t and stage t + 1 so that optimal decisions from stage t + 1 onward can be made independently of the decisions made in stages 1 to t. Each dimension of the state is represented by a state variable. State variables can be continuous or discrete.

We denote the state variable at the start of stage t by the lowercase x_t. We refer to x_t as the incoming state variable. Then, during the ...
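The role of the state can be illustrated with a small sketch. The hypothetical reservoir setting below (stored volume as the single state variable, a release decision, and an inflow noise) is an assumption for illustration, not an example from the paper; the point is that the outgoing state is a function of the incoming state, the decision, and the noise alone, with no dependence on earlier history:

```python
# A minimal sketch of a one-dimensional state in a hypothetical reservoir
# problem: the incoming state x_t is the stored volume. Given x_t, a
# decision u_t (release), and a noise observation omega_t (inflow), the
# outgoing state is fully determined; no earlier history is needed.

def transition(x_t, u_t, omega_t, capacity=100.0):
    """Return the outgoing state from the incoming state x_t."""
    # Release cannot exceed what is stored; spill anything above capacity.
    release = min(u_t, x_t)
    return min(x_t - release + omega_t, capacity)

x = 50.0                          # initial state x_1
for omega in (20.0, 0.0, 35.0):   # one sampled noise per stage
    x = transition(x, u_t=10.0, omega_t=omega)
```

Note that the loop carries only the scalar x between stages, which is exactly the "smallest piece of information" the definition requires.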