Learning Deep Stochastic Optimal Control Policies Using Forward-Backward SDEs

Pereira, Marcus A.; Wang, Ziyi; Theodorou, Evangelos A.

doi:10.15607/rss.2019.xv.070

Cited by 11 publications

(23 citation statements)

References 22 publications


“…Henceforth the stochastic control problem (12) subject to the dynamics (13), is referred to simply as problem (12). Put in words, the problem undertaken is to find a deterministic open loop optimal control for all s ∈ [t, T] to minimize the cost (10) over [t, T] given the PDE dynamics for p(s, x).…”

Section: Problem Statement (mentioning)

confidence: 99%

“…The MFG case is further complicated due to the fully coupled nature of the HJB-FP system ([7], [8], [9]). The first [10] and second ([11], [12]) order forward-backward SDE (FBSDE) [1] framework has been applied to obtain algorithms for optimal control of dynamics with nonlinear drift and state multiplicative noise, but not in the case of control multiplicative Gaussian or the general case of non-Gaussian excitation [13].…”

Section: Introduction (mentioning)

confidence: 99%


Open-loop Deterministic Density Control of Marked Jump Diffusions

Bakshi, Theodorou. 2020. Preprint.


The standard practice in modeling dynamics and optimal control of a large population, ensemble, or multi-agent system represented by its continuum density is to model individual decision making using local feedback information. In comparison to a closed-loop optimal control scheme, an open-loop strategy, in which a centralized controller broadcasts identical control signals to the ensemble of agents, mitigates the computational and infrastructure requirements for such systems. This work considers the open-loop, deterministic and optimal control synthesis for the density control of agents governed by marked jump diffusion stochastic differential equations. The density evolves according to a forward-in-time Chapman-Kolmogorov partial integro-differential equation and the necessary optimality conditions are obtained using the infinite dimensional minimum principle (IDMP). We establish the relationship between the IDMP and the dynamic programming principle as well as the IDMP and stochastic dynamic programming for the synthesized controller. Using the linear Feynman-Kac lemma, a sampling-based algorithm to compute the control is presented and demonstrated for agent dynamics with non-affine and nonlinear drift as well as noise terms.


“…The idea behind deep FBSDEs is to find a numerical approximation to the solution of the HJB equation, which is the value function given with respect to a cost function. Following the derivation in [15], the optimal control can then be calculated as a function of the partial derivative of the value function with respect to the state. We can formulate this problem as an FBSDE and solve it using a neural network which has the benefit of resolving compounding errors.…”

Section: Introduction (mentioning)

confidence: 99%
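
For orientation, the relationship this statement describes between the value function and the optimal control can be sketched as follows. This is a standard textbook form, not taken from the citing paper: it assumes control-affine dynamics and a quadratic control cost, and the symbols f, G, Σ, q, R, φ are illustrative names introduced here.

```latex
% Assumed setting (illustrative, not from the source):
%   dynamics  dx = ( f(x,t) + G(x,t) u ) dt + \Sigma(x,t) dw
%   cost      J  = E[ \phi(x_T) + \int_t^T ( q(x) + \tfrac12 u^\top R u ) ds ]
\begin{align}
  % HJB equation for the value function V(x,t)
  -\partial_t V &= \min_u \Big[ q + \tfrac12 u^\top R u
      + V_x^\top (f + G u)
      + \tfrac12 \operatorname{tr}\!\big( \Sigma \Sigma^\top V_{xx} \big) \Big],
  \qquad V(x,T) = \phi(x), \\
  % Setting the gradient of the bracket with respect to u to zero:
  % R u + G^\top V_x = 0
  u^*(x,t) &= -R^{-1} G(x,t)^\top V_x(x,t).
\end{align}
```

Under these assumptions the control depends on the value function only through its state gradient V_x, which is the quantity a deep FBSDE network is trained to approximate along sampled trajectories.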

“…Such a method has gained traction recently. [15] first proposed the deep FBSDE algorithm using LSTMs. On top of the vanilla LSTM-based FBSDE formulation, [16] showed how to handle systems with control multiplicative noise and [17] solved problems with unknown noise distribution.…”

Section: Introduction (mentioning)

confidence: 99%

Learning Locomotion Controllers for Walking Using Deep FBSDE

Dai, Surabhi, Krishnamurthy, et al. 2021. Preprint.


In this paper, we propose a deep forward-backward stochastic differential equation (FBSDE) based control algorithm for locomotion tasks. We also include state constraints in the FBSDE formulation to impose stable walking solutions or other constraints that one may want to consider (e.g., energy). Our approach utilizes a deep neural network (i.e., LSTM) to solve, in general, high-dimensional Hamilton-Jacobi-Bellman (HJB) equation resulting from the stated optimal control problem. As compared to traditional methods, our proposed method provides a higher computational efficiency in real-time; thus yielding higher frequency implementation of the closed-loop controllers. The efficacy of our approach is shown on a linear inverted pendulum model (LIPM) for walking. Even though we are deploying a simplified model of walking, the methodology is applicable to generalized and complex models for walking and other control/optimization tasks in robotic systems. Simulation studies have been provided to show the effectiveness of the proposed methodology.


“…The multiplicative noise severely affects stability and robustness [12] and the optimal control formulation in this case is non-standard due to the non-Gaussianity of the dynamics [9], with most works addressing either aspect in the linear-quadratic regime. In the case of nonlinear dynamics with multiplicative noise, differential dynamic programming [11] and forward backward stochastic differential equations using first [13] and second order schemes ([14], [15]) have been used to synthesize algorithms to compute the control.…”

Section: Introduction (mentioning)

confidence: 99%

Stabilizing Optimal Density Control of Nonlinear Agents with Multiplicative Noise

Bakshi, Theodorou, Grover. 2020. Preprint.


Control of continuous time dynamics with multiplicative noise is a classic topic in stochastic optimal control. This work addresses the problem of designing infinite horizon optimal controls with stability guarantees for large populations of identical, non-cooperative and non-networked agents with multi-dimensional and nonlinear stochastic dynamics excited by multiplicative noise. We provide constraints on the state and control cost functions which guarantee stability of the closed-loop system under the action of the individual optimal controls, for agent dynamics belonging to the class of reversible diffusion processes. A condition relating the state-dependent control cost and volatility is introduced to prove the stability of the equilibrium density. This condition is a special case of the constraint required to use the path integral Feynman-Kac formula for computing the control. We investigate the connection between the stabilizing optimal control and the path integral formalism, leading us to a control law formulation expressed exclusively in terms of the desired equilibrium density.
