2020
DOI: 10.48550/arxiv.2012.10988
Preprint

Post-hoc Uncertainty Calibration for Domain Drift Scenarios

Abstract: We address the problem of uncertainty calibration. While standard deep neural networks typically yield uncalibrated predictions, calibrated confidence scores that are representative of the true likelihood of a prediction can be achieved using post-hoc calibration methods. However, to date the focus of these approaches has been on in-domain calibration. Our contribution is two-fold. First, we show that existing post-hoc calibration methods yield highly over-confident predictions under domain shift. Second, we i…
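
Temperature scaling is a representative example of the post-hoc calibration methods the abstract refers to. The sketch below is illustrative only; the helper names, the validation-set variables, and the use of NumPy/SciPy are assumptions of this sketch, not taken from the paper. A single scalar temperature T is fitted on held-out validation logits and then used to rescale test logits before the softmax.

```python
# Hedged sketch of temperature scaling, a standard post-hoc calibration
# method of the kind discussed in the abstract. Variable names
# (val_logits, val_labels, test_logits) are placeholders, not from the paper.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import log_softmax


def fit_temperature(val_logits: np.ndarray, val_labels: np.ndarray) -> float:
    """Find the temperature T that minimises the NLL on a validation set."""
    def nll(T: float) -> float:
        log_probs = log_softmax(val_logits / T, axis=1)
        return -log_probs[np.arange(len(val_labels)), val_labels].mean()

    result = minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded")
    return float(result.x)


def calibrate(test_logits: np.ndarray, T: float) -> np.ndarray:
    """Rescale logits by T before the softmax; class ranking is unchanged."""
    z = test_logits / T
    z = z - z.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(z)
    return probs / probs.sum(axis=1, keepdims=True)
```

Because dividing the logits by a positive scalar preserves their ordering, accuracy is untouched; only the confidence scores change, which is also why, as the abstract notes, such post-hoc methods can remain over-confident once the test distribution drifts away from the data used to fit T.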

Cited by 3 publications (4 citation statements)
References 9 publications

“…A method is considered to be calibrated (or reliable) if the confidence of predictions matches the probability of being correct for all confidence levels 22,45. Formally, the empirical and predicted CDFs should be identical 46…”
Section: Definitions: Calibration and Sharpness
confidence: 99%
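
A minimal sketch of the calibration condition in the quoted statement, assuming Gaussian predictive distributions purely for illustration (the function name and the use of SciPy are assumptions, not taken from the citing papers): the predicted CDF evaluated at each observation should be uniformly distributed, so at every nominal confidence level the observed frequency should lie on the diagonal.

```python
# Sketch: compare the empirical CDF of the PIT values with the identity,
# one way to check that "empirical and predicted CDFs are identical".
import numpy as np
from scipy.stats import norm


def calibration_curve(y: np.ndarray, mu: np.ndarray, sigma: np.ndarray,
                      levels: np.ndarray = np.linspace(0.05, 0.95, 19)):
    """Return (nominal level, observed frequency) pairs; a calibrated
    model yields points close to the diagonal."""
    pit = norm.cdf(y, loc=mu, scale=sigma)   # predicted CDF at each y_i
    observed = np.array([(pit <= p).mean() for p in levels])
    return levels, observed
```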
“…Two major concepts resulting from these studies are calibration (reliability), and sharpness (resolution). A probabilistic prediction method is said to be calibrated if the confidence of predictions matches the probability of being correct for all confidence levels 45. A calibrated method is sharp if it produces the tightest possible confidence intervals 46.…”
Section: Introduction
confidence: 99%
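
Sharpness, as used in the quoted statement, can be read as the average tightness of the prediction intervals among calibrated models. The helper below is a hedged sketch under the same illustrative Gaussian assumption as above; the function name is not from the citing papers.

```python
# Sketch of sharpness as the mean width of central prediction intervals;
# among equally calibrated models, the sharper one has the smaller value.
import numpy as np
from scipy.stats import norm


def mean_interval_width(sigma: np.ndarray, confidence: float = 0.95) -> float:
    """Average width of central Gaussian prediction intervals."""
    z = norm.ppf(0.5 + confidence / 2.0)     # e.g. about 1.96 for 95%
    return float(np.mean(2.0 * z * sigma))
```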
“…Intervals-based testing. In the CS framework, a method is considered to be calibrated if the confidence of its predictions matches the probability of being correct for all confidence levels, 5,37 which can be reformulated as "prediction intervals should have the correct coverage". 21 It is convenient here to deal with prediction errors instead of predicted values, and one defines the prediction interval coverage probability (PICP) as 38…”
Section: Calibration
confidence: 99%
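
The PICP formula itself is cut off in the quote above; the sketch below uses the standard definition (the fraction of observations falling inside their prediction intervals), which is an assumption about the truncated equation rather than text recovered from it.

```python
# Sketch of the prediction interval coverage probability (PICP): for a
# calibrated model, PICP at confidence level p should be close to p.
import numpy as np


def picp(y: np.ndarray, lower: np.ndarray, upper: np.ndarray) -> float:
    """Fraction of observations y_i that fall inside [lower_i, upper_i]."""
    inside = (y >= lower) & (y <= upper)
    return float(inside.mean())
```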
“…This is naturally a substantial problem in datasets with few samples and a high amount of features, such as image, language (with high dimensional embedding spaces) or long multivariate time series datasets. Nevertheless, this issue is distinct from the field of out-of-distribution (OOD) samples, where a rich body of research exists already [33,36,20,35,42,34]. The difference to in-domain scenarios is that OOD samples are considered to not be lying on the input data manifold.…”
Section: Introduction
confidence: 99%