2021
DOI: 10.48550/arxiv.2102.13566
Preprint

Sparse approximation in learning via neural ODEs

Abstract: We consider the continuous-time, neural ordinary differential equation (neural ODE) perspective of deep supervised learning, and study the impact of the final time horizon T in training. We focus on a cost consisting of an integral of the empirical risk over the time interval, and L^1-parameter regularization. Under homogeneity assumptions on the dynamics (typical for ReLU activations), we prove that any global minimizer is sparse, in the sense that there exists a positive stopping time T* beyond which the o…

citations
Cited by 1 publication
(2 citation statements)
references
References 29 publications
“…Such results are indeed shown, in specific settings ( 2 losses), in [55] - this is the theory presented in Section 10. We also refer the reader to [53] for results with general losses and L^1(0, T; R^{d_u}) control penalties, albeit with polynomial decay rates. A proof of the turnpike property (13.15) in the case of general losses with L^2 control penalties is an open problem.…”
Section: Remark 105 (Time-irreversible Equations)
type: mentioning
confidence: 99%
“…For instance, [85] penalize the TV-norm (in time) of the control and obtain an integral turnpike property for linear, first-order hyperbolic systems. In [53], a polynomial turnpike property is obtained for the optimal controls for finite-dimensional driftless nonlinear systems, when the L^1-norm (in time) of the control is penalized, and a rather general cost is used for the state. And more specifically, when the L^1-norm (in time) of the discrepancy of the state to the running target is penalized, the authors in [86] show, for finite-dimensional systems, that a finite-time turnpike property occurs for the state, namely, the L^1 norm is saturated and the state reaches the turnpike exactly in finite time.…”
Section: Part IV Epilogue
type: mentioning
confidence: 99%