2020
DOI: 10.48550/arxiv.2010.09629
Preprint

PAC$^m$-Bayes: Narrowing the Empirical Risk Gap in the Misspecified Bayesian Regime

Abstract: While the decision-theoretic optimality of the Bayesian formalism under correct model specification is well-known [Berger, 2013], the Bayesian case becomes less clear under model misspecification [Grünwald et al., 2017, Ramamoorthi et al., 2015, Fushiki et al., 2005]. To formally understand the consequences of Bayesian misspecification, this work examines the relationship between posterior predictive risk and its sensitivity to correct model assumptions, i.e., choice of likelihood and prior. We present the mul…

Cited by 4 publications (30 citation statements)
References 6 publications

“…The underlying in-distribution (ID) measure ν(x) (dashed line, a mixture of Gaussians) produces the data points in black, while the contaminating out-of-distribution (OOD) measure ξ(x) produces the data point in red. The ensemble predictive distributions minimizing the conventional energy free criterion is denoted as J ; the one minimizing the m-free energy of [7] with m = 10 by J 10 ; and the ones minimizing the proposed robust m-free energy with m = 10 and t = {0.9, 0.7, 0.5} by J 10 0.9 , J 10 0.7 , and J 10 0.5 , respectively.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recent work has addressed the problem of model misspecification for Bayesian learning. In particular, references [7], [8] have argued that the minimization of the standard free energy criterion yields predictive distributions that do not take advantage of ensembling, and thus have poor generalization capabilities for misspecified models. To illustrate this issue, consider the example in Figure 1.…”
Section: Introduction (mentioning)
confidence: 99%
“…In a similar vein, Savage's theorem [25], which promises us optimal decisions under Bayesian decision theory, breaks down under prior misspecification [26]. Finally, it can even be shown that PAC-Bayesian inference can exceed the Bayesian one in terms of generalization performance when the prior is misspecified [27,28].…”
Section: Introduction (mentioning)
confidence: 99%