Deep Forward-Backward SDEs for Min-max Control

Wang, Ziyi; Lee, Keuntaek; Pereira, Marcus A.; Theodorou, Evangelos A.

doi:10.1109/cdc40024.2019.9028871

Cited by 13 publications

(9 citation statements)

References 19 publications


“…This is discussed with greater detail in the supplemental material, including some common instances of degeneracy. These degeneracies prove prohibitive for a variety of methods introduced in the stochastic optimal control literature, including path integral control [56][57][58][59], forward-backward stochastic differential equations using importance sampling [60,61], and recently spatio-temporal stochastic optimization [62,63]. In each case, such degeneracies must be carefully addressed.…”

Section: Problem Formulation (mentioning)

confidence: 99%

Stochastic optimization for learning quantum state feedback control

Evans, Wang, Frim et al. 2021

Preprint

Self Cite


High fidelity state preparation represents a fundamental challenge in the application of quantum technology. While the majority of optimal control approaches use feedback to improve the controller, the controller itself often does not incorporate explicit state dependence. Here, we present a general framework for training deep feedback networks for open quantum systems with quantum nondemolition measurement that allows a variety of system and control structures that are prohibitive by many other techniques and can in effect react to unmodeled effects through nonlinear filtering. We demonstrate that this method is efficient due to inherent parallelizability, robust to open system interactions, and outperforms landmark state feedback control results in simulation.


“…So far in the deep FBSDEs literature [8][9][10][11][17] a fixed initial state x[0] has been used for every batch index $b$ leading to the network only being able to solve the problem starting from x[0]. However, this is a very limiting assumption in practice, more so for the planetary soft-landing problem as the probability of the spacecraft being in a specific initial

Algorithm 1: NOVAS-FBSDE with first-exit times

1: function _ (number of time steps ($N$), altitude threshold ($h_{\mathrm{tol}}$), batch size ($B$), time discretization ($\Delta t$), LSTM neural-network to predict $V_x$ ($f_{\mathrm{LSTM}}$), diffusion matrix ($\Sigma$), system drift ($f$), Hamiltonian function (H), running cost ($l$), inputs and hyperparameters for NOVAS module (NOVAS_inputs), initial value function network ($f_{V_0}$), networks to predict initial LSTM-states ($f^{i}_{c_0}$, $f^{i}_{h_0}$), number of LSTM hidden layers ($H$), radius of base of glide-slope cone (rad), initial-state vector with uninitialized starting positions x[0])…”

Section: Training a Policy Network Invariant Of Initial Position (mentioning)

confidence: 99%

Deep $\mathcal{L}^1$ Stochastic Optimal Control Policies for Planetary Soft-landing

Pereira, Duarte, Theodorou 2021

Preprint

Self Cite


In this paper, we introduce a novel deep learning based solution to the Powered-Descent Guidance (PDG) problem, grounded in principles of nonlinear Stochastic Optimal Control (SOC) and Feynman-Kac theory. Our algorithm solves the PDG problem by framing it as an $\mathcal{L}^1$ SOC problem for minimum fuel consumption. Additionally, it can handle practically useful control constraints and nonlinear dynamics, and it enforces state constraints as soft constraints. This is achieved by building on recent work on deep Forward-Backward Stochastic Differential Equations and differentiable non-convex optimization neural-network layers based on stochastic search. In contrast to previous approaches, our algorithm does not require convexification of the constraints or linearization of the dynamics and is empirically shown to be robust to stochastic disturbances and the initial position of the spacecraft. After training offline, our controller can be activated once the spacecraft is within a pre-specified radius of the landing zone and at a pre-specified altitude, i.e., the base of an inverted cone with the tip at the landing zone. We demonstrate empirically that our controller can successfully and safely land all trajectories initialized at the base of this cone while minimizing fuel consumption.


“…A general approach for continuous-time Stochastic Optimal Control (SOC) relying on FBSDEs (Forward-Backward Stochastic Differential Equations) was recently combined with deep learning [23] to solve high-dimensional Hamilton-Jacobi-Bellman PDEs (HJB-PDEs). Most noteworthy being deep FBSDEs [33,34,47,11], a scalable framework for SOC problems that, at its core, leverages the function approximation capabilities of deep recurrent neural networks (specifically LSTMs) to learn the gradient of the value-function, which can then be used to compute optimal control policies.…”

Section: Introduction (mentioning)

confidence: 99%

Decentralized Safe Multi-agent Stochastic Optimal Control using Deep FBSDEs and ADMM

Pereira, D., Oswin et al. 2022

Preprint

Self Cite


In this work, we propose a novel safe and scalable decentralized solution for multi-agent control in the presence of stochastic disturbances. Safety is mathematically encoded using stochastic control barrier functions and safe controls are computed by solving quadratic programs. Decentralization is achieved by augmenting to each agent's optimization variables, copy variables, for its neighbors. This allows us to decouple the centralized multi-agent optimization problem. However, to ensure safety, neighboring agents must agree on what is safe for both of us and this creates a need for consensus. To enable safe consensus solutions, we incorporate an ADMM-based approach. Specifically, we propose a Merged CADMM-OSQP implicit neural network layer, that solves a mini-batch of both, local quadratic programs as well as the overall consensus problem, as a single optimization problem. This layer is embedded within a Deep FBSDEs network architecture at every time step, to facilitate end-to-end differentiable, safe and decentralized stochastic optimal control. The efficacy of the proposed approach is demonstrated on several challenging multi-robot tasks in simulation. By imposing requirements on safety specified by collision avoidance constraints, the safe operation of all agents is ensured during the entire training process. We also demonstrate superior scalability in terms of computational and memory savings as compared to a centralized approach.
