2021
DOI: 10.48550/arxiv.2110.06581
Preprint

A Trust Crisis In Simulation-Based Inference? Your Posterior Approximations Can Be Unfaithful

Abstract: We present extensive empirical evidence showing that current Bayesian simulation-based inference algorithms are inadequate for the falsificationist methodology of scientific inquiry. Our results, collected through months of experimental computations, show that all benchmarked algorithms ((S)NPE, (S)NRE, SNL, and variants of ABC) may produce overconfident posterior approximations, which makes them demonstrably unreliable and dangerous if one's scientific goal is to constrain parameters of interest. We believe that…

Cited by 29 publications (27 citation statements); References 20 publications.
“…But the uncertainty estimated by the emulator should be greater than the phenomenological model because of the limited training data. The problem of the underestimated uncertainty may come from the nature of the NF emulator [57]. NF is a series of continuous transformations so has relatively bad performance on learning truncation property.…”
Section: Discussion
confidence: 99%
“…Training parametrized classifiers and density estimators is a challenging task in machine learning, and often requires a large number of simulations and/or a post-hoc calibration procedure in order to produce satisfactory results. It was recently demonstrated that typically-used SBI algorithms tend to produce overly confident posterior estimates [143], an unacceptable outcome in cosmological applications. It is therefore imperative to perform diagnostic coverage tests (as done in, e.g., Refs.…”
Section: Uncertainty Quantification and Bias
confidence: 99%

Machine Learning and Cosmology, Dvorkin, Mishra-Sharma, Nord et al., 2022 (Preprint)
“…Following Ref. [18], the expected coverage probability of the 1 − α highest posterior density region (HPDR) of some estimated posterior p(ϑ|x) is given by…”
Section: "Getting It Right" Through Coverage Tests For Simulation-Bas…
confidence: 99%
“…This quantity can be interpreted as both expected Bayesian credibility as well as expected Frequentist coverage probability of p(ϑ|x) [18]. It enables us to estimate the actual error rate α of a 1 − α highest posterior density region of some estimated posterior p(ϑ|x).…”
Section: "Getting It Right" Through Coverage Tests For Simulation-Bas…
confidence: 99%
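The expected coverage test quoted above can be estimated numerically: draw pairs (ϑ*, x*) from the joint simulator, and check how often ϑ* falls inside the 1 − α highest posterior density region of the estimated posterior. A minimal sketch on a toy Gaussian simulator where the exact posterior is known in closed form (the toy model and all function names are illustrative, not taken from Ref. [18]):

```python
import numpy as np

rng = np.random.default_rng(0)

def hpd_contains(log_q_truth, log_q_samples, alpha):
    # theta* lies in the 1-alpha HPD region iff the fraction of posterior
    # samples with higher density than theta* is below 1 - alpha
    return np.mean(log_q_samples > log_q_truth) < 1.0 - alpha

def expected_coverage(alpha, n_trials=2000, n_post=500):
    # Monte Carlo estimate of the expected coverage of the 1-alpha HPDR
    hits = 0
    for _ in range(n_trials):
        theta = rng.normal()               # theta ~ N(0, 1) prior
        x = theta + rng.normal()           # x | theta ~ N(theta, 1)
        mu, var = x / 2.0, 0.5             # exact posterior: N(x/2, 1/2)
        samples = rng.normal(mu, np.sqrt(var), n_post)
        log_q = lambda t: -0.5 * (t - mu) ** 2 / var  # unnormalized log density
        if hpd_contains(log_q(theta), log_q(samples), alpha):
            hits += 1
    return hits / n_trials

print(expected_coverage(0.32))  # close to 0.68 for a calibrated posterior
```

For a calibrated posterior the estimated coverage matches 1 − α; an overconfident approximation, of the kind the paper warns about, yields a value below it.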