2021
DOI: 10.1017/s0956792521000139

Structure-preserving deep learning

Abstract: Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the trade-off between computational effort, amount of data and model complexity is required to successfully design a deep learning a…

Cited by 31 publications (20 citation statements). References 72 publications.
“…Differential geometry plays a fundamental role in applied mathematics, statistics, and computer science, including numerical integration [1][2][3][4][5], optimisation [6][7][8][9][10][11], sampling [12][13][14][15][16], statistics on spaces with deep learning [17,18], medical imaging and shape methods [19,20], interpolation [21], and the study of random maps [22], to name a few. Of particular relevance to this chapter is information geometry, i.e., the differential geometric treatment of smooth statistical manifolds, whose origin stems from a seminal article by Rao [23] who introduced the Fisher metric tensor on parametrised statistical models, and thus a natural Riemannian geometry that was later observed to correspond to an infinitesimal distance with respect to the Kullback-Leibler (KL) divergence [24].…”
Section: Introduction (mentioning)
confidence: 99%
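For context on the correspondence mentioned in the quote above (this is the standard textbook statement, not text from the citing paper), the Fisher information matrix appears as the local quadratic form of the Kullback-Leibler divergence on a smooth parametrised family p_theta:

```latex
% Second-order expansion of the KL divergence on a smooth family p_\theta;
% the Fisher information matrix \mathcal{I}(\theta) is the local quadratic form.
\mathrm{KL}\!\left(p_\theta \,\|\, p_{\theta+\delta\theta}\right)
  = \tfrac{1}{2}\,\delta\theta^{\top}\,\mathcal{I}(\theta)\,\delta\theta
    + \mathcal{O}\!\left(\|\delta\theta\|^{3}\right),
\qquad
\mathcal{I}(\theta)
  = \mathbb{E}_{x\sim p_\theta}\!\left[
      \nabla_\theta \log p_\theta(x)\,
      \nabla_\theta \log p_\theta(x)^{\top}
    \right].
```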
“…• First of all, we ought to suppose that x^(i) ≠ x^(j) for i ≠ j. Now, due to the uniqueness of Lipschitz-nonlinear ODEs (in both directions of time), trajectories corresponding to different initial data cannot cross [34]. Hence, in the context of binary classification tasks for instance (namely, where f is the characteristic function of some set), if the original dataset is not linearly separable, one cannot separate the dataset by a controlled neural ODE flow in a way that the underlying topology of the data (namely, the unknown function f) is captured and generalized.…”
Section: Remark 105 (Time-irreversible Equations) (mentioning)
confidence: 99%
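The non-crossing property invoked in this statement can be seen in a minimal sketch (added here for intuition, not taken from the cited works; the scalar vector field and the weights w, b are illustrative choices): two trajectories started from distinct initial points under a Lipschitz right-hand side preserve their ordering for all time, which is precisely why a flow alone cannot untangle interleaved classes.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical Lipschitz vector field standing in for a scalar neural-ODE layer:
# dx/dt = tanh(w * x + b). Any globally Lipschitz right-hand side would behave the same way.
w, b = 1.5, 0.3
def f(t, x):
    return np.tanh(w * x + b)

t_span = (0.0, 5.0)
t_eval = np.linspace(*t_span, 200)

# Two distinct initial conditions x^(1)(0) < x^(2)(0).
sol1 = solve_ivp(f, t_span, [0.0], t_eval=t_eval)
sol2 = solve_ivp(f, t_span, [0.1], t_eval=t_eval)

# By uniqueness of solutions, the two trajectories never cross:
# the ordering of the initial data is preserved along the whole flow.
assert np.all(sol1.y[0] < sol2.y[0])
```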
“…The neural ODE formalism of deep learning has been used to great effect in several machine learning contexts. To name a few, these include the use of adaptive ODE solvers [35, 49, 107] and symplectic schemes [34] for efficient training, the use of indirect training algorithms based on the Pontryagin Maximum Principle [119, 15], image super-resolution [92], as well as unsupervised learning and generative modeling [72, 137]. The origins of continuous-time supervised learning date back at least to [117], in which the backpropagation method is connected to the adjoint method.…”
Section: Remark 105 (Time-irreversible Equations) (mentioning)
confidence: 99%
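The adjoint/Pontryagin connection referred to in this statement can be summarised by the standard adjoint equations for a neural ODE with dynamics dz/dt = f(z, t, θ) and terminal loss L(z(t₁)); this is the textbook form, added here for context rather than quoted from any of the cited works:

```latex
% Adjoint equations for \dot{z}(t) = f(z(t), t, \theta) with terminal loss L(z(t_1));
% a(t) is the adjoint (costate) variable, as in the Pontryagin Maximum Principle.
a(t_1) = \frac{\partial L}{\partial z(t_1)}, \qquad
\frac{\mathrm{d}a}{\mathrm{d}t}
  = -\,a(t)^{\top}\,\frac{\partial f(z(t), t, \theta)}{\partial z}, \qquad
\frac{\mathrm{d}L}{\mathrm{d}\theta}
  = -\int_{t_1}^{t_0} a(t)^{\top}\,
    \frac{\partial f(z(t), t, \theta)}{\partial \theta}\,\mathrm{d}t .
```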
“…In particular, they take as a point of departure this variational approach, which captures acceleration in continuous time by considering a particular type of time-dependent Lagrangian functions, called Bregman Lagrangians (see Section 2). In a recent paper [3], the authors introduce symplectic integrators (and also presymplectic integrators) for the integration of the differential equations associated with accelerated optimization methods (see references [27, 12, 5] for an introduction to symplectic integration). In [3] the authors use the Hamiltonian formalism, since it is possible to extend the phase space so as to turn the system into a time-independent Hamiltonian system and apply the standard symplectic techniques there (see [20, 9]). See recent improvements of this approach using adaptive Hamiltonian variational integrators [11].…”
Section: Introduction (mentioning)
confidence: 99%
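As a generic illustration of the symplectic-integration idea invoked in this statement (a sketch only, not the specific scheme of [3] or [11]; the objective grad_f and the matrix A are hypothetical), a symplectic Euler step for the Hamiltonian H(q, p) = ½‖p‖² + f(q) first kicks the momentum with the gradient of f and then drifts the position; adding a mild damping factor turns this conservative update into a momentum-style minimiser.

```python
import numpy as np

def grad_f(q):
    # Hypothetical objective f(q) = 0.5 * q^T A q, standing in for the function being minimised.
    A = np.array([[3.0, 0.5], [0.5, 1.0]])
    return A @ q

def damped_symplectic_euler(q, p, step, n_steps, friction=0.9):
    """Kick-then-drift update for H(q, p) = 0.5*|p|^2 + f(q).
    With friction=1.0 this is the classical symplectic Euler method (energy-preserving);
    with friction < 1 it becomes the dissipative variant behind momentum/heavy-ball descent."""
    for _ in range(n_steps):
        p = friction * p - step * grad_f(q)   # momentum update (kick), with damping
        q = q + step * p                      # position update (drift)
    return q, p

q, p = np.array([2.0, -1.0]), np.zeros(2)
q, p = damped_symplectic_euler(q, p, step=0.1, n_steps=500)
print(q)  # approaches the minimiser q* = 0 of the quadratic objective
```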