2017
DOI: 10.48550/arxiv.1710.10121
Preprint

Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations

Yiping Lu,
Aoxiao Zhong,
Quanzheng Li
et al.

Abstract: Deep neural networks have become the state-of-the-art models in numerous machine learning tasks. However, general guidance to network architecture design is still missing. In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep arch…
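
The paper's central observation is easy to state concretely: a residual block computes x_{n+1} = x_n + f(x_n), which is exactly one forward-Euler step of size h = 1 for the ODE dx/dt = f(x). The following is a minimal illustrative sketch, not the authors' code; the two-layer branch f and the weights W1, W2 are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
W1 = 0.1 * rng.standard_normal((d, d))  # hypothetical weights
W2 = 0.1 * rng.standard_normal((d, d))

def f(x):
    # Residual branch: a small two-layer MLP with ReLU.
    return W2 @ np.maximum(W1 @ x, 0.0)

def resnet_block(x):
    # Identity shortcut plus residual branch: x_{n+1} = x_n + f(x_n).
    return x + f(x)

def euler_step(x, h=1.0):
    # One explicit (forward) Euler step for dx/dt = f(x).
    return x + h * f(x)

x = rng.standard_normal(d)
# With step size h = 1, the Euler step and the ResNet block coincide.
assert np.allclose(resnet_block(x), euler_step(x, h=1.0))
```

Under this reading, the other architectures named in the abstract correspond to other numerical schemes (the paper relates PolyNet to an approximation of backward Euler and FractalNet to a Runge–Kutta-style scheme), which is the sense in which numerical discretization guides architecture design.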

Cited by 47 publications (71 citation statements)
References 26 publications
Citing publications span 2018–2024.
“…Recent work on residual networks [Lu et al., 2017, Haber and Ruthotto, 2017, Ruthotto and Haber, 2018] interprets residual connections as an Euler discretization of a continuous transformation through time. Motivated by this interpretation, Chen et al. [2018] generalized residual networks by using more sophisticated black-box ODE solvers such as dopri5 [Dormand and Prince, 1980].…”
Section: Neural Ordinary Differential Equations (mentioning)
confidence: 99%
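
The progression this citation describes, from a fixed Euler stack to a black-box adaptive solver, can be sketched in a few lines. This is an illustrative sketch only: the dynamics f and the weights W are hypothetical, and a real Neural ODE learns W by differentiating through the solver or via the adjoint method. SciPy's "RK45" method implements the Dormand–Prince 5(4) pair, the same family as the dopri5 solver cited above:

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(1)
d = 8
W = 0.1 * rng.standard_normal((d, d))  # hypothetical weights

def f(t, x):
    # Autonomous dynamics from a single tanh layer.
    return np.tanh(W @ x)

x0 = rng.standard_normal(d)
T, N = 1.0, 10

# Discrete (ResNet) view: N forward-Euler steps of size h = T / N.
x = x0.copy()
for _ in range(N):
    x = x + (T / N) * f(0.0, x)

# Continuous view: hand the same dynamics to a black-box adaptive
# solver; "RK45" is the Dormand-Prince 5(4) pair behind dopri5.
sol = solve_ivp(f, (0.0, T), x0, method="RK45", rtol=1e-6, atol=1e-9)

print("Euler output       :", x[:3])
print("dopri5-style output:", sol.y[:3, -1])
```

The two outputs approximate the same flow; the adaptive solver simply chooses its own step sizes rather than using a fixed stack of layers.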
“…2017; Lu et al., 2017) to physics (Greydanus et al., 2019). For instance, casting residual networks (He et al., 2016) as a discretization of ordinary differential equations enables fundamental reasoning about the loss landscape (Lu et al., 2020) and inspires new architectures with numerical stability or a continuous limit (Chang et al., 2018; Chen et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%
“…Introduction. Artificial neural networks have emerged as the state-of-the-art technology in many machine learning tasks, including, but not limited to, computer vision, speech recognition, and natural language processing [20, 29]; among these, the residual network (ResNet) and its numerous variants (see [9, 7, 14, 25, 36] and the references cited therein) have attracted broad attention for their simplicity and effectiveness. In addition, ResNet makes it possible to train networks up to hundreds or even thousands of layers deep without performance degradation [10].…”
(mentioning)
confidence: 99%
“…However, deeper networks usually require extensive training, which impedes their real-time application. To circumvent this issue, as well as to improve the generalization capability of trained models, stochastic training techniques have become widespread in the deep learning community [15, 25]. Unfortunately, designing a data-oriented network architecture for a real-world learning problem is often more art than science, e.g., injecting dropout layers into ResNet-like models [10, 36], tuning the dropout probability and model hyperparameters [15, 32], etc.…”
(mentioning)
confidence: 99%
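
The "injecting dropout layers into ResNet-like models" that this excerpt mentions is often realized as stochastic depth, where the residual branch is randomly dropped during training. A minimal sketch under that assumption; the branch f, its weights, and the survival probability p are hypothetical, not taken from the cited works:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 64
W1 = 0.1 * rng.standard_normal((d, d))  # hypothetical weights
W2 = 0.1 * rng.standard_normal((d, d))

def f(x):
    # Hypothetical two-layer residual branch.
    return W2 @ np.maximum(W1 @ x, 0.0)

def stochastic_depth_block(x, p=0.8, training=True):
    # Training: keep the residual branch with probability p, otherwise
    # fall back to the identity map. Test time: scale the branch by its
    # survival rate p so activations match in expectation.
    if training:
        return x + f(x) if rng.random() < p else x
    return x + p * f(x)

x = rng.standard_normal(d)
print(stochastic_depth_block(x, training=True)[:3])
print(stochastic_depth_block(x, training=False)[:3])
```

The survival probability p plays the role of the dropout-style hyperparameter whose tuning the excerpt calls "more art than science".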