2017
DOI: 10.48550/arxiv.1709.03698
Preprint

Reversible Architectures for Arbitrarily Deep Residual Neural Networks

Cited by 17 publications (31 citation statements). References 22 publications.

“…Sonoda & Murata (2017) and Li & Shi (2017) also regarded ResNets as dynamical systems given by the characteristic lines of a transport equation on the distribution of the data set. Similar observations were made by Chang et al. (2017), who designed a reversible architecture to grant stability to the dynamical system. On the other hand, many deep network designs were inspired by optimization algorithms, such as the network LISTA (Gregor & LeCun, 2010) and the ADMM-Net (Yang et al., 2016).…”
Section: Related Work (supporting)
confidence: 63%
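The dynamical-systems reading in this excerpt corresponds to viewing each residual block as one forward-Euler step of an ODE. Below is a minimal PyTorch-style sketch of that reading, not code from any of the cited papers; the block structure, feature dimension, and step size h are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualEulerBlock(nn.Module):
    """A residual block read as one forward-Euler step x_{k+1} = x_k + h * f(x_k)
    of the ODE dx/dt = f(x); an illustrative sketch, not a cited architecture."""
    def __init__(self, dim: int, h: float = 0.1):
        super().__init__()
        self.h = h
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The skip connection is the Euler update; stacking blocks integrates over depth.
        return x + self.h * self.f(x)

# Deeper network = more integration steps of the underlying ODE.
net = nn.Sequential(*[ResidualEulerBlock(16) for _ in range(10)])
y = net(torch.randn(4, 16))
```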
“…One can view a ghost element as a pseudo-element that lies outside the domain and is used to control the gradient. For example, with a k = 2 architecture from Equation 4b, one needs the initial position and velocity in order to define x^(2) as a function of x^(0) and x^(1). In the next subsection it will be shown that the dense network [7] can be interpreted as the interior/ghost elements needed to initialize the dynamical equation.…”
Section: Architectures Induced From Smooth Transformations (mentioning)
confidence: 99%
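The point of the k = 2 example in this excerpt is that a two-step recursion cannot start from a single state: it needs two initial values, e.g. a position and a velocity-derived "ghost" state. A hedged sketch of such a recursion follows; since Equation 4b is not reproduced in this excerpt, the update rule here is a generic central-difference stand-in and all names are illustrative.

```python
import torch
import torch.nn as nn

class TwoStepBlock(nn.Module):
    """Generic two-step update x_{k+1} = 2*x_k - x_{k-1} + h^2 * f(x_k), i.e. a
    central-difference discretization of a second-order ODE; a stand-in, not Equation 4b."""
    def __init__(self, dim: int, h: float = 0.1):
        super().__init__()
        self.h = h
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, x_prev: torch.Tensor, x_curr: torch.Tensor) -> torch.Tensor:
        return 2 * x_curr - x_prev + self.h ** 2 * self.f(x_curr)

dim = 16
x0 = torch.randn(4, dim)        # initial "position" x^(0)
v0 = torch.randn(4, dim)        # initial "velocity"
x1 = x0 + 0.1 * v0              # second starting state x^(1), the interior/ghost element
x2 = TwoStepBlock(dim)(x0, x1)  # x^(2) is only defined once x^(0) and x^(1) are given
```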
“…Work by Chang et al. [3] also considered residual neural networks as forward-difference approximations to C^1 transformations. This work has been extended to new network architectures by using central differencing, as opposed to forward differencing, to approximate the set of coupled first-order differential equations, yielding the Midpoint Network [2]. Similarly, other researchers have used different numerical schemes to approximate the first-order ordinary differential equations, such as the linear multistep method used to develop the Linear Multistep architecture [10].…”
Section: Introduction (mentioning)
confidence: 99%
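A hedged sketch of a linear-multistep-style residual update in the spirit of the architecture mentioned in this excerpt; the mixing coefficient k, layer shapes, and class name are illustrative assumptions, not taken from [10].

```python
import torch
import torch.nn as nn

class LinearMultistepBlock(nn.Module):
    """Two-step residual update x_{n+1} = (1 - k) * x_n + k * x_{n-1} + f(x_n)
    with a learnable coefficient k; an illustrative sketch, not the exact scheme of [10]."""
    def __init__(self, dim: int):
        super().__init__()
        self.k = nn.Parameter(torch.zeros(1))   # learnable multistep coefficient
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x_prev: torch.Tensor, x_curr: torch.Tensor) -> torch.Tensor:
        return (1 - self.k) * x_curr + self.k * x_prev + self.f(x_curr)
```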
“…The connection between dynamical systems and neural network models has been widely studied in the literature; see, for example, [1]-[5]. In general, neural networks can be considered as discrete dynamical systems with the basic dynamics at each step being a linear transformation followed by a component-wise nonlinear (activation) function.…”
Section: Introduction (mentioning)
confidence: 99%
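The "linear transformation followed by a component-wise nonlinearity" view in this excerpt amounts to reading a network as an iterated map. A tiny illustrative sketch, with the simplifying assumption of shared weights across steps (in an actual network each layer has its own weights):

```python
import torch
import torch.nn as nn

# One step of the discrete dynamical system x_{k+1} = sigma(W x_k + b):
# a linear transformation followed by a component-wise nonlinearity.
dim = 8
layer = nn.Linear(dim, dim)
x = torch.randn(4, dim)
for _ in range(5):            # iterating the map plays the role of network depth
    x = torch.tanh(layer(x))  # weights shared across steps here only for brevity
```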
“…In recent works [14]-[19], the primary focus has been to solve the inverse problem, i.e., identifying Hamiltonian systems from data, using structured neural networks. For example, HNNs [15] use a neural network to approximate the Hamiltonian H in (1), which is then learned by reformulating the loss function. Based on HNNs, other models were proposed to tackle problems in generative modeling [16], [19] and continuous control [20].…”
Section: Introduction (mentioning)
confidence: 99%
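A hedged sketch of the HNN idea described in this excerpt: a scalar network approximates the Hamiltonian, and the loss is reformulated so that its partial derivatives match Hamilton's equations via automatic differentiation. Network sizes, data, and names are illustrative assumptions, not taken from [15].

```python
import torch
import torch.nn as nn

class HNNSketch(nn.Module):
    """Scalar network H(q, p) whose partial derivatives give the predicted dynamics."""
    def __init__(self, dim: int):
        super().__init__()
        self.H = nn.Sequential(nn.Linear(2 * dim, 64), nn.Tanh(), nn.Linear(64, 1))

    def time_derivatives(self, q: torch.Tensor, p: torch.Tensor):
        q = q.requires_grad_(True)
        p = p.requires_grad_(True)
        H = self.H(torch.cat([q, p], dim=-1)).sum()
        dHdq, dHdp = torch.autograd.grad(H, (q, p), create_graph=True)
        # Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq.
        return dHdp, -dHdq

model = HNNSketch(dim=2)
q, p = torch.randn(16, 2), torch.randn(16, 2)
dq_dt, dp_dt = torch.randn(16, 2), torch.randn(16, 2)   # placeholder "observed" derivatives
dq_pred, dp_pred = model.time_derivatives(q, p)
loss = ((dq_pred - dq_dt) ** 2 + (dp_pred - dp_dt) ** 2).mean()
loss.backward()
```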