2021
DOI: 10.48550/arxiv.2106.00651
Preprint

Asymptotics of representation learning in finite Bayesian neural networks

Abstract: Recent works have suggested that finite Bayesian neural networks may outperform their infinite cousins because finite networks can flexibly adapt their internal representations. However, our theoretical understanding of how the learned hidden layer representations of finite networks differ from the fixed representations of infinite networks remains incomplete. Perturbative finite-width corrections to the network prior and posterior have been studied, but the asymptotics of learned features have not been fully …
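The infinite-width baseline that the abstract contrasts with can be made concrete numerically. Below is a minimal sketch (illustrative only, not code from the paper) comparing the empirical feature kernel of a width-n random bias-free ReLU layer against its exact infinite-width limit, the first-order arc-cosine kernel. The activation choice, the 1/d weight scaling, and the function names arccos_kernel and empirical_kernel are all assumptions made for illustration.

# Minimal sketch (not from the paper): empirical feature kernel of a
# finite random ReLU layer vs. its infinite-width (arc-cosine) limit.
import numpy as np

rng = np.random.default_rng(0)
d = 3                          # input dimension
x = rng.normal(size=d)
y = rng.normal(size=d)

def arccos_kernel(u, v):
    # Infinite-width kernel E[relu(w.u) relu(w.v)] for w ~ N(0, I/d):
    # Cho & Saul's first-order arc-cosine kernel.
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    c = np.clip(u @ v / (nu * nv), -1.0, 1.0)
    theta = np.arccos(c)
    return nu * nv / (2 * np.pi * d) * (np.sin(theta) + (np.pi - theta) * c)

def empirical_kernel(u, v, n, trials=5000):
    # Monte Carlo over random width-n networks: mean and std of
    # K_n(u, v) = (1/n) * sum_i relu(w_i.u) relu(w_i.v).
    vals = np.empty(trials)
    for t in range(trials):
        W = rng.normal(scale=1.0 / np.sqrt(d), size=(n, d))
        vals[t] = np.maximum(W @ u, 0.0) @ np.maximum(W @ v, 0.0) / n
    return vals.mean(), vals.std()

k_inf = arccos_kernel(x, y)
print(f"infinite-width kernel: {k_inf:.4f}")
for n in (10, 100, 1000):
    mean, std = empirical_kernel(x, y, n)
    print(f"n={n:5d}  mean K_n={mean:.4f}  std={std:.4f}  std*sqrt(n)={std * np.sqrt(n):.4f}")

For a single hidden layer, the prior mean of K_n equals the arc-cosine value exactly, while the printed std column shrinks like 1/sqrt(n), i.e. the kernel's fluctuations have variance O(1/n). The paper's perturbative analysis concerns corrections of this order, but to the learned (posterior) kernel rather than the prior sampled here.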

Citation types: 4 supporting, 21 mentioning, 0 contrasting.
Cited by 3 publications (25 citation statements); references 15 publications.

“…We view our approaches as complementary, and warmly recommend [2] for readers interested in a pedagogical treatment of many ideas at this interface of theoretical physics and machine learning. We note that finite-width corrections to the feature kernel were also considered recently in [45].…”
Section: Relation To Other Work (mentioning)
confidence: 68%
“…In other words, in the phase space parametrized by $(\sigma_w^2, \sigma_b^2)$, the line $\sigma_w^2 = \gamma^2$ lies further to the right than the line $\sigma_{w,\mathrm{eff}}^2 = \gamma^2$, past which the expression for the two-point function becomes complex. Note that, as we discussed in subsec. 4.3.1, this does not imply that the true critical point does not exhibit a shift (and indeed, empirical observations demonstrate that there are sizable corrections to the result from the CLT; see, e.g., [17,18]), but simply that the critical point happens to lie at the edge of the strong-coupling regime at which the present perturbative treatment breaks down.…”
Section: The Loop-corrected Correlation Length (mentioning)
confidence: 82%
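As background for the quoted passage (stated in standard mean-field notation, not the quoted paper's own: its $\gamma$ and $\sigma_{w,\mathrm{eff}}^2$ are defined in that work), the CLT-level critical line of a deep network is where the slope susceptibility of signal propagation equals one:

% Standard edge-of-chaos criticality condition (Poole et al. 2016;
% Schoenholz et al. 2017), given only as orientation for the reader.
\[
  \chi_1 \;=\; \sigma_w^2 \int \mathcal{D}z \,\bigl[\phi'\!\bigl(\sqrt{q^*}\,z\bigr)\bigr]^2 \;=\; 1,
  \qquad
  \mathcal{D}z \equiv \frac{dz}{\sqrt{2\pi}}\, e^{-z^2/2},
\]

where $q^*$ is the fixed point of the preactivation variance map. The correlation length of the two-point function diverges on this line, and finite-width (loop) corrections shift it; the quoted passage describes exactly such a shift via the effective coupling $\sigma_{w,\mathrm{eff}}^2$.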
“…As a result, a growing number of recent works have aimed to study the behavior of networks near the kernel limit, with the hope that leading-order corrections to the large-width behavior might elucidate how width and depth affect inference [4, 27–39]. Some of these works focus on the properties of the function-space prior distribution [27–32], some consider maximum-likelihood inference with gradient descent [30, 38, 39], and some consider properties of the full Bayes posterior [4, 28, 30, 33–37]. This body of research has resulted in several conjectural conditions under which narrower and deeper networks might perform better than their infinitely-wide cousins in the Bayesian setting, as measured by generalization for fixed data [33, 35, 36] or by some alternative criterion based on entropic considerations [30].…”
Section: Introduction (mentioning)
confidence: 99%