2021
DOI: 10.48550/arxiv.2111.12143
Preprint

Critical Initialization of Wide and Deep Neural Networks through Partial Jacobians: General Theory and Applications

Abstract: Deep neural networks are notorious for defying theoretical treatment. However, when the number of parameters in each layer tends to infinity, the network function is a Gaussian process (GP) and a quantitatively predictive description is possible. The Gaussian approximation allows one to formulate criteria for selecting hyperparameters, such as the variances of weights and biases, as well as the learning rate. These criteria rely on the notion of criticality defined for deep neural networks. In this work we describe a new way…
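As a rough illustration of the idea in the abstract, the sketch below probes criticality numerically for a plain tanh MLP by measuring how the norm of a partial Jacobian of preactivations behaves with depth. This is not the authors' code: the layer widths, the tanh activation, the helper names (init_params, propagate, avg_partial_jacobian_norm), and the choice sigma_w2 = 1.0, sigma_b2 = 0.0 are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def init_params(key, widths, sigma_w2=1.0, sigma_b2=0.0):
    """Gaussian init: weight variance sigma_w2 / fan_in, bias variance sigma_b2."""
    params = []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        key, wk, bk = jax.random.split(key, 3)
        W = jax.random.normal(wk, (d_out, d_in)) * jnp.sqrt(sigma_w2 / d_in)
        b = jax.random.normal(bk, (d_out,)) * jnp.sqrt(sigma_b2)
        params.append((W, b))
    return params

def propagate(h_l0, params, l0, l):
    """Map the preactivation h^{l0} to the preactivation h^{l} (layers l0+1 .. l)."""
    h = h_l0
    for W, b in params[l0:l]:
        h = W @ jnp.tanh(h) + b
    return h

def avg_partial_jacobian_norm(params, x, l0, l):
    """Width-normalized squared Frobenius norm of the partial Jacobian dh^l / dh^{l0}."""
    # forward pass from the input up to the preactivation at layer l0
    h = params[0][0] @ x + params[0][1]
    for W, b in params[1:l0]:
        h = W @ jnp.tanh(h) + b
    J = jax.jacobian(propagate)(h, params, l0, l)  # shape (n_l, n_{l0})
    return jnp.sum(J ** 2) / J.shape[0]

# Example: scan depth at fixed (sigma_w2, sigma_b2). Near a critical initialization
# the norm stays O(1); away from it, it grows or decays roughly exponentially in l - l0.
key = jax.random.PRNGKey(0)
widths = [16] + [512] * 10 + [16]
params = init_params(key, widths, sigma_w2=1.0, sigma_b2=0.0)
x = jax.random.normal(jax.random.PRNGKey(1), (widths[0],))
for l in (3, 6, 9):
    print(l, float(avg_partial_jacobian_norm(params, x, 1, l)))
```

Scanning the (sigma_w2, sigma_b2) plane with this diagnostic is one way such criteria for hyperparameter selection could be applied in practice.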

Cited by 1 publication (1 citation statement)
References 15 publications
“…It is known that, when the number of parameters in each layer tends to infinity, the network function becomes a Gaussian process for which a quantitatively predictive description is possible. Doshi et al. [30] demonstrate a new method for diagnosing criticality. To do this, they introduce the partial Jacobians of the neural network, defined as the derivatives of preactivations.…”
Section: Introduction (mentioning)
confidence: 99%
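Read together with the abstract, the quantity referred to is the partial Jacobian of preactivations. A plausible formalization, with notation assumed here rather than copied from the preprint, is

$$ J^{l_0,l}_{ij}(x) \;=\; \frac{\partial h^{l}_{i}(x)}{\partial h^{l_0}_{j}(x)}, \qquad \mathcal{J}(l_0,l) \;=\; \frac{1}{N_l}\,\mathbb{E}\!\left[\operatorname{Tr}\!\big(J^{l_0,l}\,(J^{l_0,l})^{\top}\big)\right], $$

where $h^{l}(x)$ is the preactivation at layer $l$, $N_l$ its width, and the expectation is over the Gaussian initialization of weights and biases. Criticality then corresponds to $\mathcal{J}(l_0,l)$ remaining of order one, rather than growing or decaying exponentially, as $l - l_0$ becomes large.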