2020
DOI: 10.1142/s0219530520400023
Neural ODEs as the deep limit of ResNets with constant weights

Abstract: In this paper we prove that, in the deep limit, stochastic gradient descent on a ResNet-type deep neural network, in which every layer shares the same weight matrix, converges to stochastic gradient descent for a Neural ODE, and that the corresponding value/loss functions converge. Our result gives, in the context of minimization by stochastic gradient descent, a theoretical foundation for considering Neural ODEs as the deep limit of ResNets. Our proof is based on certain decay estimates for associated Fokker–Planck equations.
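
To make the setting of the abstract concrete, here is a minimal sketch (not taken from the paper) of a ResNet in which every layer shares one weight matrix, viewed as a forward-Euler discretization of a Neural ODE; the specific residual map tanh(Wx + b) and the step size 1/depth are assumptions made for illustration.

```python
import numpy as np

def residual_map(x, W, b):
    # Shared residual block: the same (W, b) is reused in every layer.
    return np.tanh(W @ x + b)

def resnet_constant_weights(x0, W, b, depth):
    # x_{k+1} = x_k + (1/depth) * f(x_k; W, b): forward Euler with step 1/depth,
    # so the network approximates the Neural ODE dx/dt = f(x; W, b) on [0, 1].
    x = x0.copy()
    h = 1.0 / depth
    for _ in range(depth):
        x = x + h * residual_map(x, W, b)
    return x

rng = np.random.default_rng(0)
d = 4
W = rng.normal(size=(d, d)) / np.sqrt(d)
b = rng.normal(size=d)
x0 = rng.normal(size=d)

# As the depth grows, the shared-weight ResNet output stabilizes toward the
# Neural ODE solution at time t = 1 (the "deep limit" of the abstract).
for depth in (4, 16, 64, 256, 1024):
    print(depth, resnet_constant_weights(x0, W, b, depth))
```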

Cited by 18 publications (17 citation statements)
References 21 publications
“…Later on, this dynamical approach has been greatly popularized in the machine learning community under the name of NeurODE by Chen et al [27], see also [52]. The formulation starts by re-interpreting the iteration (1.2) as a discrete-time Euler approximation [9] of the following dynamical system $\dot{X}_t = F(t, X_t, \theta_t)$, …”
Section: NeurODEs and Stochastic Optimal Control
confidence: 99%
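
Spelled out, the re-interpretation quoted above is the standard forward-Euler correspondence; the following is a sketch in which the step size h = T/N and the per-layer sampling of the parameters are assumptions, since iteration (1.2) itself is not reproduced here.

```latex
% With N layers, step size h = T/N, and times t_k = kh, the residual update
%   X_{k+1} = X_k + h F(t_k, X_k, \theta_k),  k = 0, ..., N-1,
% is the explicit Euler scheme for the continuous-time dynamical system.
\[
  X_{k+1} = X_k + h\,F(t_k, X_k, \theta_k)
  \quad\longleftrightarrow\quad
  \dot{X}_t = F(t, X_t, \theta_t), \qquad t \in [0, T],\ h = T/N,\ t_k = kh .
\]
```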
“…(3) Mean field analysis of (stochastic) gradient descent for two-layer neural networks [16,45,54,61] and multi-layer fully connected networks [2,49,60]. (4) The function space work [24,25]. Also related are the works in [3,4,7,53,64].…”
Section: Introduction
confidence: 99%
“…(4) The function space work [24,25]. Also related are the works in [3,4,7,53,64]. The work presented here is a natural extension of these ideas.…”
Section: Introduction
confidence: 99%
“…Since the initial proposal of residual networks [26], many works have studied them theoretically, observing that the forward pass of a residual network resembles the explicit Euler scheme of an ordinary differential equation [30][31][32]. The question of stability, invertibility and reusability of the convolutional filters became central [33,34].…”
Section: Recurrent Residual Network as ODE
confidence: 99%