2022
DOI: 10.48550/arxiv.2202.04777
Preprint

Exact Solutions of a Deep Linear Network

Abstract: This work finds the exact solutions to a deep linear network with weight decay and stochastic neurons, a fundamental model for understanding the landscape of neural networks. Our result implies that weight decay strongly interacts with the model architecture and can create bad minima in a network with more than one hidden layer, which is qualitatively different from a network with only one hidden layer. As an application, we also analyze stochastic nets and show that their prediction variance vanishes as the stochas…

Citations: cited by 2 publications (11 citation statements)
References: 14 publications
“…The existing initialization methods are predominantly data-[in]dependent. However, our result (also see [24]) suggests that the size of the trivial minimum is data-dependent, and our result thus highlights the importance of designing data-dependent initialization methods in deep learning.…”
Section: A1 Sensitivity To the Initial Condition (mentioning)
confidence: 57%
“…where x is the input data, y the label, U and W^(i) the model parameters, D the network depth, the noise in the hidden layer (e.g., dropout), d_0 the width of the model, and γ the weight decay strength. We build on the recent results established in [24]. Let b := U d_0.…”
Section: Results (mentioning)
confidence: 99%
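
The symbols named in the statement above (U, W^(i), D, d_0, γ, and the hidden-layer noise) can be made concrete with a small numerical sketch. This is not the objective from the cited work, whose equation is elided in the excerpt. It assumes a squared-error data term, a scalar output, a single weight-decay strength shared by all layers, and mean-one Gaussian multiplicative noise on the hidden units. The function name `deep_linear_loss` and the parameter `noise_std` are hypothetical.

```python
import numpy as np


def deep_linear_loss(x, y, U, Ws, gamma, noise_std=0.0, rng=None):
    """Squared error plus L2 weight decay for a deep linear network.

    Illustrative assumptions (not taken from the cited paper):
    x has shape (n, d_in), y has shape (n,), each W in Ws maps one
    layer to the next, and U has shape (d_0,) for a scalar output.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    h = x
    for W in Ws:  # len(Ws) plays the role of the depth D
        h = h @ W.T  # linear layer, no nonlinearity
        if noise_std > 0:
            # Assumed noise model: multiply each hidden unit by a
            # Gaussian factor with mean 1 (a dropout-like perturbation).
            h = h * (1.0 + noise_std * rng.standard_normal(h.shape))
    pred = h @ U
    data_term = np.mean((pred - y) ** 2)
    # One shared weight-decay strength gamma for every parameter.
    penalty = gamma * (np.sum(U ** 2) + sum(np.sum(W ** 2) for W in Ws))
    return data_term + penalty


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((6, 3))
    y = rng.standard_normal(6)
    d0 = 4  # hidden width
    Ws = [rng.standard_normal((d0, 3)), rng.standard_normal((d0, d0))]
    U = rng.standard_normal(d0)
    print(deep_linear_loss(x, y, U, Ws, gamma=0.1, noise_std=0.5, rng=rng))
```

With every weight set to zero, the prediction is zero and the penalty vanishes, so under these assumptions the loss reduces to the mean of y squared regardless of γ or the noise level. That all-zero point is the kind of trivial solution the first citation statement refers to.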