2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021
DOI: 10.1109/iccv48922.2021.00117

Predicting with Confidence on Unseen Distributions

Cited by 60 publications (42 citation statements)
References 15 publications
“…(1), when the model is well-calibrated, the average of the calibrated confidence is close to the average accuracy, i.e., Conf(D) ≈ Acc(D). Meanwhile, predicting model performance accurately is an essential ingredient in developing reliable machine learning systems, especially under distributional shifts [Guillory et al, 2021]. As shown in Table 1, we find that our proposed method produces well-calibrated confidence values on both InD and OOD domains.…”
Section: Predicting Generalization (mentioning)
confidence: 77%
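
The relation Conf(D) ≈ Acc(D) quoted above can be illustrated with a minimal sketch. It assumes the confidence of each example is the maximum softmax probability (MSP) and that calibration is a single temperature applied to the logits; the function names and the `temperature` parameter are hypothetical and are not taken from the cited papers.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over the class axis (assumed calibration scheme).
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def average_confidence(logits, temperature=1.0):
    # Conf(D): mean of the maximum softmax probability over all examples.
    probs = softmax(logits, temperature)
    return float(probs.max(axis=1).mean())

def accuracy(logits, labels):
    # Acc(D): fraction of examples whose argmax class matches the label.
    predictions = np.argmax(np.asarray(logits), axis=1)
    return float((predictions == np.asarray(labels)).mean())
```

If the model is well calibrated on a dataset, `average_confidence(logits)` computed without labels should land near `accuracy(logits, labels)`, so the former may serve as a label-free estimate of the latter; how well this holds under distribution shift is exactly what the quoted statement is concerned with.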
“…A separate line of work departs from complexity measures altogether and directly predicts OOD generalization from unlabelled test data. These methods either predict the correctness of the model directly on individual examples [14,32,15] or directly estimate the total error [19,24,9,10,68]. Although these methods work well in practice, they do not provide any insight into the underlying mechanism of generalization since they act only on the output layer of the network.…”
Section: Related Work (mentioning)
confidence: 99%
“…Unsupervised model performance evaluation: Model performance evaluation without labels has received relatively limited attention. Domain-specific models' performance can be estimated via certain statistics, such as confidence score [17], rotation prediction [10] and feature statistics of the datasets sampled from a meta-dataset [11] for image recognition. General model evaluation often relies on different assumptions and accessibility [6,7,8,9,13,18,20,38].…”
Section: Introduction (mentioning)
confidence: 99%
“…Domain-specific models' performance can be estimated via certain statistics, such as confidence score [17], rotation prediction [10] and feature statistics of the datasets sampled from a meta-dataset [11] for image recognition. General model evaluation often relies on different assumptions and accessibility [6,7,8,9,13,18,20,38]. For example, [8] assumes covariate shift and requires users to provide an approximation (slice) of the shifted features, while [6] needs white-box access to the ML models to train an ensemble as a reference.…”
Section: Introduction (mentioning)
confidence: 99%