2022
DOI: 10.3389/frai.2022.889981

Shallow Univariate ReLU Networks as Splines: Initialization, Loss Surface, Hessian, and Gradient Flow Dynamics

Abstract: Understanding the learning dynamics and inductive bias of neural networks (NNs) is hindered by the opacity of the relationship between NN parameters and the function represented. Partially, this is due to symmetries inherent within the NN parameterization, allowing multiple different parameter settings to result in an identical output function, resulting in both an unclear relationship and redundant degrees of freedom. The NN parameterization is invariant under two symmetries: permutation of the neurons and a …
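To make the spline view concrete, the following minimal sketch (not taken from the paper; all weights and numbers are illustrative) evaluates a shallow univariate ReLU network as a piecewise-linear spline and checks the two parameter symmetries named in the abstract, neuron permutation and per-neuron positive rescaling, both of which leave the represented function unchanged.

```python
# Illustrative sketch, not the paper's code: a shallow univariate ReLU network
# f(x) = sum_i v_i * relu(w_i * x + b_i) + c is piecewise linear, with a
# potential "knot" wherever w_i * x + b_i = 0. Permuting neurons, or rescaling
# (w_i, b_i) -> (a_i * w_i, a_i * b_i) and v_i -> v_i / a_i with a_i > 0,
# leaves the output function unchanged.
import numpy as np

rng = np.random.default_rng(0)
n_hidden = 5
w = rng.normal(size=n_hidden)   # input weights
b = rng.normal(size=n_hidden)   # biases
v = rng.normal(size=n_hidden)   # output weights
c = 0.1                         # output bias

def f(x, w, b, v, c):
    pre = np.outer(x, w) + b            # (n_points, n_hidden) pre-activations
    return np.maximum(pre, 0.0) @ v + c

x = np.linspace(-3, 3, 7)
y0 = f(x, w, b, v, c)

# Symmetry 1: permute the hidden neurons.
perm = rng.permutation(n_hidden)
y_perm = f(x, w[perm], b[perm], v[perm], c)

# Symmetry 2: positive per-neuron rescaling.
a = rng.uniform(0.5, 2.0, size=n_hidden)
y_scale = f(x, a * w, a * b, v / a, c)

print(np.allclose(y0, y_perm), np.allclose(y0, y_scale))  # True True

# Knot locations of the spline: x_i = -b_i / w_i (for w_i != 0).
print(np.sort(-b / w))
```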

Cited by 6 publications (5 citation statements). References 39 publications.
“…The idea behind implicit regularization is that the loss landscape of a network has many minima, and which minimum one converges to after training depends on many factors, including the choice of model architecture and parametrization [37], [38], the initialization scheme [39], and the optimization algorithm [40], [41], [42]. The implicit regularization of state-of-the-art models has been shown to play a critical role in the generalization of deep neural networks [43], [44].…”
Section: Implicit Regularization and Data Augmentation (mentioning)
Confidence: 99%
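As a concrete illustration of the point made in this excerpt, the sketch below (a hypothetical toy setup, not taken from any of the cited works) trains the same shallow ReLU network on a handful of 1-D points from two different random seeds. Both runs typically reach a small training error, yet the two learned functions usually disagree between and beyond the training points, so the minimum reached depends on the initialization.

```python
# Illustrative sketch: gradient descent on a tiny shallow ReLU network,
# started from two different initializations; hyperparameters and data are
# made up for illustration.
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def predict(x, w, b, v):
    # shallow univariate ReLU net: f(x) = sum_j v_j * relu(w_j * x + b_j)
    return relu(np.outer(x, w) + b) @ v

def train(seed, x, y, n_hidden=16, lr=0.01, steps=20000):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n_hidden)
    b = rng.normal(size=n_hidden)
    v = rng.normal(scale=0.1, size=n_hidden)
    n = len(x)
    for _ in range(steps):
        pre = np.outer(x, w) + b               # (n, n_hidden)
        err = relu(pre) @ v - y                # residuals, (n,)
        d_pred = 2.0 * err / n                 # d(MSE)/d(prediction)
        g_v = relu(pre).T @ d_pred
        g_h = np.outer(d_pred, v) * (pre > 0)  # backprop through the ReLU
        g_w = g_h.T @ x
        g_b = g_h.sum(axis=0)
        w -= lr * g_w; b -= lr * g_b; v -= lr * g_v
    return w, b, v

# toy data, made up for illustration
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([0.5, -1.0, 0.0, 1.5, -0.5])

for seed in (0, 1):
    w, b, v = train(seed, x, y)
    mse = np.mean((predict(x, w, b, v) - y) ** 2)
    # both runs typically fit the five points closely, but their predictions
    # usually differ away from the training data
    f_half = predict(np.array([0.5]), w, b, v)[0]
    print(f"seed={seed}  train MSE={mse:.2e}  f(0.5)={f_half:.3f}")
```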
“…• Subsection III-C analyzes how different data augmentations pose constraints on the learned weights in the Fourier domain. This can be seen as an aspect of the so-called network implicit bias (see [37], [38], [39], [40], [41], [42], [43], [44]). In Section IV we test our theoretical results in a simple task of classification on MNIST.…”
Section: Introduction and Previous Work (mentioning)
Confidence: 99%
“…Neural models are able to represent complex, non-linear functions with reasonable computational costs. More recently, kernelized versions of NNs have been developed that restrict the massive expressive power of NNs while still capturing non-linear relationships, and also control the smoothness of the resulting predictive models [14].…”
Section: Description of Machine Learning (mentioning)
Confidence: 99%
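The "kernelized versions of NNs" mentioned in this excerpt can be illustrated with kernel ridge regression under the degree-1 arc-cosine kernel (Cho and Saul, 2009), which corresponds to an infinitely wide one-hidden-layer ReLU network. This sketch is an assumed stand-in, not the method of the excerpt's reference 14; the toy data and ridge values are made up, and the ridge parameter is what controls the smoothness of the resulting predictor.

```python
# Illustrative sketch: kernel ridge regression with the degree-1 arc-cosine
# kernel, an infinite-width analogue of a one-hidden-layer ReLU network.
# The ridge parameter lam trades data fit against smoothness.
import numpy as np

def arccos_kernel(X, Y):
    # X: (n, d), Y: (m, d); degree-1 arc-cosine kernel of Cho & Saul (2009):
    # k(x, y) = (1/pi) * ||x|| * ||y|| * (sin(theta) + (pi - theta) * cos(theta))
    nx = np.linalg.norm(X, axis=1)
    ny = np.linalg.norm(Y, axis=1)
    cos = np.clip((X @ Y.T) / np.outer(nx, ny), -1.0, 1.0)
    theta = np.arccos(cos)
    return np.outer(nx, ny) / np.pi * (np.sin(theta) + (np.pi - theta) * cos)

def lift(x):
    # append a constant bias feature so 1-D inputs have nontrivial angles
    return np.stack([x, np.ones_like(x)], axis=1)

# toy 1-D regression data, made up for illustration
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_train = np.array([0.3, -0.8, 0.1, 1.2, -0.4])
x_test = np.linspace(-3, 3, 9)

K = arccos_kernel(lift(x_train), lift(x_train))
for lam in (1e-3, 1.0):  # small lam: flexible fit; large lam: smoother predictor
    alpha = np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)
    y_pred = arccos_kernel(lift(x_test), lift(x_train)) @ alpha
    print(f"lambda={lam}: predictions {np.round(y_pred, 2)}")
```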
“…In addressing the aforementioned question, we adopt a similar, but more general, approach that relies on the concept of “implicit bias.” Implicit bias in machine learning refers to the phenomenon where the training process of an overparameterized network, influenced by factors including the choice of model architecture and parametrization (Gunasekar et al., 2018; Yun et al., 2020), the initialization scheme (Sahs et al., 2020a), and the optimization algorithm (Williams et al., 2019; Sahs et al., 2020b; Woodworth et al., 2020), naturally favors certain solutions or patterns over others, even in the absence of explicit bias in the training data. The implicit bias of state-of-the-art models has been shown to play a critical role in the generalization of deep neural networks (Arora et al., 2019; Li et al., 2019).…”
Section: Introduction (mentioning)
Confidence: 99%