Post-hoc Uncertainty Calibration for Domain Drift Scenarios

Tomani, Christian; Gruber, Sebastian; Erdem, Muhammed Ebrar; Cremers, Daniel; Buettner, Florian

doi:10.1109/cvpr46437.2021.00999

Cited by 39 publications

(42 citation statements)

References 4 publications


“…Calibration Performance under Dataset Drift: Tomani et al [52] show that DNNs are over-confident and highly uncalibrated under dataset/domain shift. Our experiments shows that a model trained with MDCA fairs well in terms of calibration performance even under non-semantic/natural domain shift.…”

Section: Results (mentioning)

confidence: 99%

“…Following [31], we also divide the training set of photo domain into 9:1 train/val set. Rotated MNIST Dataset: This dataset is also used for domain shift experiments. Inspired from [52], we create 5 different test sets namely {M_15, M_30, M_45, M_60, M_75}. Domain drift is introduced in each M_x by rotating the images in the MNIST test set by x degrees counter-clockwise.…”

mentioning

confidence: 99%
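The rotated test sets described in the statement above can be sketched in a few lines. This is a minimal illustration only, not the citing authors' code: the function names `rotate_ccw` and `make_rotated_test_sets` are hypothetical, images are assumed to be plain 2D lists of pixel values, and nearest-neighbour sampling with zero fill is an assumption (the excerpt does not say which interpolation was used).

```python
import math

def rotate_ccw(image, degrees):
    """Rotate a 2D list of pixels counter-clockwise (as displayed) about its centre.

    Uses nearest-neighbour sampling; pixels that map outside the source are 0.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    cos_t = math.cos(math.radians(degrees))
    sin_t = math.sin(math.radians(degrees))
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Centred coordinates of the output pixel, with y pointing up.
            x_out, y_out = c - cx, cy - r
            # Inverse rotation: where in the source this output pixel comes from.
            x_src = x_out * cos_t + y_out * sin_t
            y_src = -x_out * sin_t + y_out * cos_t
            src_r = int(round(cy - y_src))
            src_c = int(round(cx + x_src))
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = image[src_r][src_c]
    return out

def make_rotated_test_sets(test_images, angles=(15, 30, 45, 60, 75)):
    """Build one shifted copy of the test set per angle, keyed by the angle x of M_x."""
    return {angle: [rotate_ccw(img, angle) for img in test_images] for angle in angles}
```

Labels are unchanged by the rotation, so each M_x can be scored against the original test labels to see how calibration degrades as x grows.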


A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration

Hebbalaguppe, Prakash, Madan et al. 2022

Preprint


Deep Neural Networks (DNNs) are known to make overconfident mistakes, which makes their use problematic in safety-critical applications. State-of-the-art (SOTA) calibration techniques improve on the confidence of predicted labels alone, and leave the confidence of non-max classes (e.g. top-2, top-5) uncalibrated. Such calibration is not suitable for label refinement using post-processing. Further, most SOTA techniques learn a few hyper-parameters post-hoc, leaving out the scope for image, or pixel specific calibration. This makes them unsuitable for calibration under domain shift, or for dense prediction tasks like semantic segmentation. In this paper, we argue for intervening at the train time itself, so as to directly produce calibrated DNN models. We propose a novel auxiliary loss function: Multi-class Difference in Confidence and Accuracy (MDCA), to achieve the same.
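The abstract names the auxiliary loss but does not give its formula. As a rough sketch of one plausible reading of the name, the penalty below averages, over classes, the absolute gap between the mean predicted probability for a class and that class's frequency in the batch. This form, the function name `mdca_penalty`, and the plain-list inputs are all assumptions for illustration; consult the paper for the authors' exact definition.

```python
def mdca_penalty(probs, labels, num_classes):
    """Assumed form: mean over classes of |average confidence - label frequency|.

    probs  : list of per-sample probability vectors (each sums to 1)
    labels : list of integer ground-truth labels
    """
    n = len(labels)
    total = 0.0
    for k in range(num_classes):
        mean_conf = sum(p[k] for p in probs) / n          # average predicted mass on class k
        freq = sum(1 for y in labels if y == k) / n       # how often class k is the true label
        total += abs(mean_conf - freq)
    return total / num_classes
```

Because every class contributes, a term like this also touches the non-max probabilities, which matches the abstract's motivation; in training it would be added to the usual classification loss with some weight.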


“…A straightforward yet efficient strategy to mitigate mis-calibrated predictions is to include a post-processing step, which transforms the probability predictions of a deep network [5,8,30,34]. Among these methods, temperature scaling [8], a variant of Platt scaling [28], employs a single scalar parameter over all the pre-softmax activations, which results in softened class predictions.…”

Section: Related Work (mentioning)

confidence: 99%
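Temperature scaling, as described in the statement above, divides all pre-softmax activations by one scalar T fitted on held-out data. The sketch below is a minimal pure-Python illustration under stated assumptions: the helper names (`softmax`, `nll`, `fit_temperature`) are hypothetical, and the grid search over T is a simplification chosen for readability (implementations commonly use a gradient-based optimizer instead).

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed stably."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)

def fit_temperature(val_logits, val_labels, grid=None):
    """Pick the temperature on a grid that minimises validation NLL."""
    if grid is None:
        grid = [0.25 * k for k in range(1, 41)]  # 0.25, 0.50, ..., 10.0
    return min(grid, key=lambda t: nll(val_logits, val_labels, t))

# Toy validation set: very confident logits, but only 3 of 4 predictions correct.
val_logits = [[8.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 0.0]]
val_labels = [0, 1, 1, 0]
T = fit_temperature(val_logits, val_labels)
calibrated = softmax([8.0, 0.0], T)
```

Dividing by T > 1 softens the probabilities without changing the predicted class, since the ordering of the logits is preserved.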

“…Despite its good performance on in-domain samples, [25] demonstrated that temperature scaling does not work well under data distributional shift. [30] mitigated this limitation by transforming the validation set before performing the post-hoc calibration step. In [20], a ranking model was introduced to improve the post-processing model calibration, whereas [5] used a simple regression model to predict the temperature parameter during the inference phase.…”

Section: Related Work (mentioning)

confidence: 99%

“…Quantifying the predictive uncertainty for modern DNNs has received an increased attention recently, with a variety of alternatives to better calibrate network outputs. A simple strategy consists in including a post-processing step during the test phase to transform the output of a trained network [5,8,30,34], with the parameters of this additional operation determined on a validation set. Despite their simplicity and low computational cost, these methods were shown to be effective when training and testing data are drawn from the same distribution.…”

Section: Introduction (mentioning)

confidence: 99%


The Devil is in the Margin: Margin-based Label Smoothing for Network Calibration

Liu, Ayed, Galdrán et al. 2021

Preprint


In spite of the dominant performances of deep neural networks, recent works have shown that they are poorly calibrated, resulting in over-confident predictions. Miscalibration can be exacerbated by overfitting due to the minimization of the cross-entropy during training, as it promotes the predicted softmax probabilities to match the one-hot label assignments. This yields a pre-softmax activation of the correct class that is significantly larger than the remaining activations. Recent evidence from the literature suggests that loss functions that embed implicit or explicit maximization of the entropy of predictions yield state-of-the-art calibration performances. We provide a unifying constrained-optimization perspective of current state-of-the-art calibration losses. Specifically, these losses could be viewed as approximations of a linear penalty (or a Lagrangian) imposing equality constraints on logit distances. This points to an important limitation of such underlying equality constraints, whose ensuing gradients constantly push towards a non-informative solution, which might prevent from reaching the best compromise between the discriminative performance and calibration of the model during gradient-based optimization. Following our observations, we propose a simple and flexible generalization based on inequality constraints, which imposes a controllable margin on logit distances. Comprehensive experiments on a variety of image classification, semantic segmentation and NLP benchmarks demonstrate that our method sets novel state-of-the-art results on these tasks in terms of network calibration, without affecting the discriminative performance. The code is available at https://github.com/by-liu/MbLS.
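The inequality constraint on logit distances mentioned in the abstract can be illustrated with a hinge-style penalty: each logit's distance to the largest logit is penalised only once it exceeds a margin. This is a sketch under assumptions, not the code from the linked repository; the names `margin_penalty` and `total_loss`, the `weight` parameter, and the exact way the penalty is combined with cross-entropy are hypothetical.

```python
import math

def margin_penalty(logits, margin):
    """Sum of hinge terms max(0, (max_logit - logit_k) - margin) over classes."""
    top = max(logits)
    return sum(max(0.0, (top - z) - margin) for z in logits)

def cross_entropy(logits, label):
    """Standard cross-entropy from raw logits, via a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def total_loss(logits, label, margin=6.0, weight=0.1):
    """Cross-entropy plus the weighted margin penalty (weight and margin are illustrative)."""
    return cross_entropy(logits, label) + weight * margin_penalty(logits, margin)
```

With a margin of 6, logits `[10, 0, 4]` incur a penalty of 4 (only the distance of 10 exceeds the margin), while `[3, 0, 1]` incur none, so gradients from the penalty vanish once all distances are within the margin rather than pushing toward the uniform solution.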


Improving the Reliability for Confidence Estimation

Foo et al. 2022

Lecture Notes in Computer Science


Confidence estimation, a task that aims to evaluate the trustworthiness of the model's prediction output during deployment, has received lots of research attention recently, due to its importance for the safe deployment of deep models. Previous works have outlined two important qualities that a reliable confidence estimation model should possess, i.e., the ability to perform well under label imbalance and the ability to handle various out-of-distribution data inputs. In this work, we propose a meta-learning framework that can simultaneously improve upon both qualities in a confidence estimation model. Specifically, we first construct virtual training and testing sets with some intentionally designed distribution differences between them. Our framework then uses the constructed sets to train the confidence estimation model through a virtual training and testing scheme leading it to learn knowledge that generalizes to diverse distributions. We show the effectiveness of our framework on both monocular depth estimation and image classification.
