2021
DOI: 10.48550/arxiv.2111.11954
Preprint

Depth induces scale-averaging in overparameterized linear Bayesian neural networks

Jacob A. Zavatone-Veth,
Cengiz Pehlevan

Abstract: Inference in deep Bayesian neural networks is only fully understood in the infinite-width limit, where the posterior flexibility afforded by increased depth washes out and the posterior predictive collapses to a shallow Gaussian process. Here, we interpret finite deep linear Bayesian neural networks as data-dependent scale mixtures of Gaussian process predictors across output channels. We leverage this observation to study representation learning in these networks, allowing us to connect limiting results obtain…
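As a rough numerical illustration of the scale-mixture claim (a minimal sketch of my own, not code from the paper; the width, depths, and sample counts are arbitrary), one can check that the prior predictive of a deep linear network with i.i.d. Gaussian weights develops heavier-than-Gaussian tails once depth exceeds one, as expected for a Gaussian scale mixture:

```python
# Minimal sketch (not from the paper): for a deep linear network with i.i.d.
# Gaussian weight priors, the prior over outputs at a fixed input is a scale
# mixture of Gaussians, so for depth >= 2 it has positive excess kurtosis.
import numpy as np

rng = np.random.default_rng(0)

def sample_prior_outputs(depth, width=10, n_samples=50_000, sigma2=1.0):
    """Draw f(x) = W_L ... W_1 x at a fixed unit-norm input x under
    independent N(0, sigma2 / fan_in) weight priors."""
    x = np.ones(width) / np.sqrt(width)          # fixed unit-norm input
    outs = np.empty(n_samples)
    for i in range(n_samples):
        h = x
        for _ in range(depth):
            W = rng.normal(0.0, np.sqrt(sigma2 / h.size), size=(width, h.size))
            h = W @ h
        outs[i] = h[0]                           # a single output channel
    return outs

for depth in (1, 3):
    f = sample_prior_outputs(depth)
    excess_kurtosis = np.mean(f**4) / np.mean(f**2) ** 2 - 3.0
    print(f"depth {depth}: excess kurtosis ~ {excess_kurtosis:.2f}")
# depth 1 gives ~0 (a plain Gaussian), while depth 3 gives a clearly positive
# value, reflecting the data-dependent scale mixing induced by depth.
```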


Cited by 1 publication (12 citation statements)
References 18 publications
“…Using the replica trick [45,46], we compute learning curves for simple linear regression, deep linear Gaussian random feature (RF) models, and deep linear NNs. Our results are obtained using an isotropic Gaussian likelihood in the limit of small likelihood variance, which renders this analysis analytically tractable [33,34]. Using alternative replica-free methods and numerical simulation, we show that the predictions obtained under a replica-symmetric (RS) Ansatz are accurate for all three model classes.…”
Section: Introduction
confidence: 93%
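The small-likelihood-variance limit mentioned in this excerpt can be made concrete with a toy Bayesian linear regression (a hedged sketch of my own, not taken from the cited papers; the dimensions and prior scale are arbitrary): with prior w ~ N(0, αI) and an isotropic Gaussian likelihood of variance σ², the posterior mean is a ridge estimator with λ = σ²/α, and as σ² → 0 in the overparameterized regime it approaches the minimum-norm interpolator:

```python
# Illustrative sketch (not from the cited papers): the posterior mean under a
# Gaussian prior and isotropic Gaussian likelihood is ridge regression with
# lambda = sigma2 / alpha; as sigma2 -> 0 it converges to the pseudoinverse
# (minimum-norm) solution when there are fewer samples than features.
import numpy as np

rng = np.random.default_rng(1)
n, d = 20, 100                      # fewer samples than features (overparameterized)
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

alpha = 1.0
w_min_norm = np.linalg.pinv(X) @ y                                  # minimum-norm interpolator
for sigma2 in (1.0, 1e-2, 1e-6):
    lam = sigma2 / alpha
    w_post = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)    # ridge / posterior mean
    gap = np.linalg.norm(w_post - w_min_norm)
    print(f"sigma2={sigma2:.0e}  ||w_post - w_min_norm|| = {gap:.3e}")
```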
“…In the noise-free case, this limiting likelihood is matched to the true generative model of the data, but it is clearly mismatched in the presence of label noise. This limit has been considered in several recent studies of deep linear Bayesian neural networks [4, 30, 33–35].…”
Section: Generalization Error in the Thermodynamic Limit
confidence: 99%
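To see why the noise-free likelihood is mismatched under label noise, here is a small toy example of my own (not from the cited work; the sizes and noise level are arbitrary): in the σ² → 0 limit the posterior mean interpolates the training labels exactly, so any label noise is fit verbatim rather than averaged away:

```python
# Hedged illustration (my own toy example): the sigma2 -> 0 posterior mean
# under a Gaussian prior is the minimum-norm interpolator, which drives the
# training residual to zero even when the labels contain noise.
import numpy as np

rng = np.random.default_rng(2)
n, d = 30, 200
X = rng.normal(size=(n, d)) / np.sqrt(d)
w_true = rng.normal(size=d)
y_noisy = X @ w_true + 0.5 * rng.normal(size=n)      # labels corrupted by noise

w_hat = np.linalg.pinv(X) @ y_noisy                   # min-norm interpolator (sigma2 -> 0 limit)
print("train residual:", np.linalg.norm(X @ w_hat - y_noisy))   # ~0: the noise is interpolated

X_test = rng.normal(size=(1000, d)) / np.sqrt(d)
test_mse = np.mean((X_test @ w_hat - X_test @ w_true) ** 2)
print("test MSE vs. noiseless targets:", test_mse)
```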