2019 IEEE 58th Conference on Decision and Control (CDC) 2019
DOI: 10.1109/cdc40024.2019.9028871

Deep Forward-Backward SDEs for Min-max Control

Abstract: This paper presents a novel approach to numerically solve stochastic differential games for nonlinear systems. The proposed approach relies on the nonlinear Feynman-Kac theorem that establishes a connection between parabolic deterministic partial differential equations and forward-backward stochastic differential equations. Using this theorem, the Hamilton-Jacobi-Isaacs partial differential equation associated with differential games is represented by a system of forward-backward stochastic differential equation…
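For orientation, the connection the abstract refers to can be written in a generic textbook form. This is a sketch under assumptions, not the paper's exact formulation: the symbols f (drift), Σ (diffusion), h (nonlinear source term) and φ (terminal cost) are placeholders chosen here. In the Hamilton-Jacobi-Isaacs setting, h would presumably collect the terms produced by the min-max over the two players' controls.

```latex
% Parabolic terminal-value PDE for V(t, x):
\begin{align*}
\partial_t V + \tfrac{1}{2}\operatorname{tr}\!\left(\Sigma\Sigma^{\top} V_{xx}\right)
  + V_x^{\top} f(t, x) + h\!\left(t, x, V, \Sigma^{\top} V_x\right) &= 0,
  & V(T, x) &= \phi(x).
\end{align*}
% Associated forward-backward SDE system:
\begin{align*}
dX_t &= f(t, X_t)\,dt + \Sigma(t, X_t)\,dW_t, & X_0 &= x_0, \\
dY_t &= -h(t, X_t, Y_t, Z_t)\,dt + Z_t^{\top}\,dW_t, & Y_T &= \phi(X_T).
\end{align*}
% Identification of the backward process with the PDE solution:
\begin{align*}
Y_t = V(t, X_t), \qquad Z_t = \Sigma^{\top}(t, X_t)\,V_x(t, X_t).
\end{align*}
```

Applying Itô's formula to V(t, X_t) and substituting the PDE gives the backward equation, which is why sampling the forward process yields information about V and its gradient along trajectories.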

Citations: cited by 13 publications (9 citation statements)
References: 19 publications
“…This is discussed with greater detail in the supplemental material, including some common instances of degeneracy. These degeneracies prove prohibitive for a variety of methods introduced in the stochastic optimal control literature, including path integral control [56][57][58][59], forward-backward stochastic differential equations using importance sampling [60,61], and recently spatio-temporal stochastic optimization [62,63]. In each case, such degeneracies must be carefully addressed.…”
Section: Problem Formulation (mentioning)
confidence: 99%
“…So far in the deep FBSDEs literature [8][9][10][11][17] a fixed initial state x[0] has been used for every batch index b, leading to the network only being able to solve the problem starting from x[0]. However, this is a very limiting assumption in practice, more so for the planetary soft-landing problem as the probability of the spacecraft being in a specific initial […]
Algorithm 1 NOVAS-FBSDE with first-exit times
1: function _(number of time steps (N), altitude threshold (h_tol), batch size (B), time discretization (Δt), LSTM neural network to predict V_x (f_LSTM), diffusion matrix (Σ), system drift (f), Hamiltonian function (ℋ), running cost (l), inputs and hyperparameters for NOVAS module (NOVAS_inputs), initial value function network (f_V0), networks to predict initial LSTM states (f^i_c0, f^i_h0), number of LSTM hidden layers (H), radius of base of glide-slope cone (rad), initial-state vector with uninitialized starting positions x[0])…”
Section: Training a Policy Network Invariant of Initial Position (mentioning)
confidence: 99%
“…A general approach for continuous-time Stochastic Optimal Control (SOC) relying on FBSDEs (Forward-Backward Stochastic Differential Equations) was recently combined with deep learning [23] to solve high-dimensional Hamilton-Jacobi-Bellman PDEs (HJB-PDEs). Most noteworthy being deep FBSDEs [33,34,47,11], a scalable framework for SOC problems that, at its core, leverages the function approximation capabilities of deep recurrent neural networks (specifically LSTMs) to learn the gradient of the value-function, which can then be used to compute optimal control policies.…”
Section: Introduction (mentioning)
confidence: 99%
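The idea in the statement above, propagating the value function forward in time using its gradient and then comparing against the terminal cost, can be illustrated with a toy example. The following is a minimal sketch under strong assumptions and is not the method of any cited paper: it uses a scalar problem with zero drift, unit diffusion, no source term and terminal cost φ(x) = x², for which V(t, x) = x² + (T − t) is known in closed form. The analytic gradient stands in for the LSTM that a deep FBSDE method would train, and `propagate_fbsdes` is a hypothetical helper name.

```python
import numpy as np


def propagate_fbsdes(x0, y0, grad_v, T=1.0, n_steps=1000, batch=4096, seed=0):
    """Euler-Maruyama propagation of a toy scalar FBSDE (f = 0, Sigma = 1, h = 0).

    grad_v(t, x) plays the role of the learned value-function gradient V_x.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(batch, x0, dtype=float)  # forward state X_t
    y = np.full(batch, y0, dtype=float)  # value process Y_t = V(t, X_t)
    for n in range(n_steps):
        t = n * dt
        dw = rng.normal(0.0, np.sqrt(dt), size=batch)  # Brownian increments
        z = grad_v(t, x)  # Z_t = Sigma^T V_x, with Sigma = 1 here
        y = y + z * dw    # dY = -h dt + Z dW, with h = 0
        x = x + dw        # dX = f dt + Sigma dW, with f = 0
    return x, y


T = 1.0
x0 = 0.5
# Exact initial value V(0, x0) = x0^2 + T and exact gradient V_x = 2x.
x_T, y_T = propagate_fbsdes(x0, x0**2 + T, lambda t, x: 2.0 * x, T=T)

# Mismatch between the propagated value and the terminal cost phi(X_T) = X_T^2.
# A deep FBSDE method would minimize a loss of this kind over network weights.
terminal_loss = float(np.mean((y_T - x_T**2) ** 2))
print(f"terminal loss: {terminal_loss:.5f}")
```

With the exact gradient, the remaining mismatch comes only from time discretization and shrinks as `n_steps` grows. With an inaccurate `grad_v` the loss increases, which is the signal a trainable approximator would use.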