2021
DOI: 10.48550/arXiv.2104.14421
Preprint

What Are Bayesian Neural Network Posteriors Really Like?

Pavel Izmailov, Sharad Vikram, Matthew D. Hoffman, et al.

Abstract: The posterior over Bayesian neural network (BNN) parameters is extremely high-dimensional and non-convex. For computational reasons, researchers approximate this posterior using inexpensive mini-batch methods such as mean-field variational inference or stochastic-gradient Markov chain Monte Carlo (SGMCMC). To investigate foundational questions in Bayesian deep learning, we instead use full-batch Hamiltonian Monte Carlo (HMC) on modern architectures. We show that (1) BNNs can achieve significant performance gain…
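As a rough illustration of the full-batch HMC the abstract refers to, below is a minimal numpy sketch of a single HMC transition (leapfrog integration plus a Metropolis correction). The log_prob/grad_log_prob callables and the step-size and leapfrog settings are placeholder assumptions, not the paper's implementation, which runs at far larger scale on modern architectures.

```python
import numpy as np

def hmc_step(theta, log_prob, grad_log_prob, step_size=1e-3, n_leapfrog=50, rng=None):
    """One full-batch HMC transition: leapfrog integration + Metropolis correction."""
    rng = rng or np.random.default_rng()
    p0 = rng.standard_normal(theta.shape)       # resample momentum
    q, p = theta.copy(), p0.copy()
    p += 0.5 * step_size * grad_log_prob(q)     # half step for momentum
    for _ in range(n_leapfrog - 1):
        q += step_size * p                      # full step for position
        p += step_size * grad_log_prob(q)       # full step for momentum
    q += step_size * p                          # final position step
    p += 0.5 * step_size * grad_log_prob(q)     # final half step for momentum
    # Hamiltonians: negative log posterior plus Gaussian kinetic energy
    h_old = -log_prob(theta) + 0.5 * p0 @ p0
    h_new = -log_prob(q) + 0.5 * p @ p
    if np.log(rng.uniform()) < h_old - h_new:   # Metropolis accept/reject
        return q, True
    return theta, False
```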

Cited by 27 publications (38 citation statements)
References 43 publications
“…We ran experiments on Bayesian NN regression, classification, logistic regression and ICA (Amari et al., 1996), reporting accuracies, log joints (Welling and Teh, 2011; Izmailov et al., 2021) and expected calibration error (ECE) (Guo et al., 2017). For details on exact experimental setups please see Appendix F. Across experiments we compare to SGLD as in (Izmailov et al., 2021). In the Bayesian NN tasks the likelihood is parametrised via p(y…”
Section: Results
confidence: 99%
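For reference, the expected calibration error (ECE) of Guo et al. (2017) cited in this excerpt can be computed from softmax outputs alone. A minimal sketch, assuming equal-width confidence bins (the bin count here is an illustrative choice, not the citing paper's exact setup):

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    """ECE: |accuracy - confidence| averaged over confidence bins,
    weighted by the fraction of samples in each bin (Guo et al., 2017)."""
    confidences = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece
```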
“…We measure the quality of each sampling method's approximation to the predictive distribution corresponding to the true posterior. We generate the ground truth predictive distribution by running 20,000 samples of SGLD (Welling & Teh, 2011), and follow (Izmailov et al., 2021) by measuring the top-1 agreement and total variation with respect to the ground truth predictive distribution. Total variation is…”
Section: Bayesian Neural Network Subspace Inference
confidence: 99%
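The two metrics in this excerpt, top-1 agreement and total variation against a ground-truth predictive distribution, follow Izmailov et al. (2021). A small sketch, assuming probs and probs_ref are (N, C) arrays of per-input class probabilities:

```python
import numpy as np

def top1_agreement(probs, probs_ref):
    """Fraction of inputs where both predictive distributions pick the same class."""
    return (probs.argmax(axis=1) == probs_ref.argmax(axis=1)).mean()

def total_variation(probs, probs_ref):
    """Mean total-variation distance between per-input predictive distributions:
    TV(p, q) = 0.5 * sum_c |p_c - q_c|, averaged over inputs."""
    return 0.5 * np.abs(probs - probs_ref).sum(axis=1).mean()
```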
“…The task is 10-class image classification, and we use a ResNet-20 with Filter Response Normalization from (Izmailov et al., 2021) as our base model. Our results in Figure 5 demonstrate that our online thinning method outperforms both the baseline sampler and SPMCMC-based samplers on the agreement metric.…”
Section: CIFAR-10 Classification
confidence: 99%
“…This implies that good Bayesian inference can only be made if the statistics of y are correctly modeled. For example, stochastic neural networks are expected to have well-calibrated uncertainty estimates, a trait that is highly desirable for practical safe and reliable applications (Wilson & Izmailov, 2020; Gawlikowski et al., 2021; Izmailov et al., 2021). This expectation means that a well-trained stochastic network should have a predictive variance that matches the actual level of randomness in the labeling.…”
Section: Related Work
confidence: 99%
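One way to make the final sentence of this excerpt concrete is as a regression diagnostic comparing average predictive variance to the empirical squared residual; the sketch below is a hypothetical illustration under that reading, not a construction from the cited works:

```python
import numpy as np

def variance_calibration_gap(pred_mean, pred_var, y):
    """Rough calibration check for a stochastic regressor: if the predictive
    variance matches the actual label noise, the mean predictive variance
    should roughly equal the mean squared residual, i.e. the gap is near zero."""
    return float(np.mean(pred_var) - np.mean((y - pred_mean) ** 2))
```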