2019
DOI: 10.48550/arxiv.1903.12261
Preprint
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations

Abstract: In this paper we establish rigorous benchmarks for image classifier robustness. Our first benchmark, IMAGENET-C, standardizes and expands the corruption robustness topic, while showing which classifiers are preferable in safety-critical applications. Then we propose a new dataset called IMAGENET-P which enables researchers to benchmark a classifier's robustness to common perturbations. Unlike recent robustness research, this benchmark evaluates performance on common corruptions and perturbations not worst-case…
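The IMAGENET-C benchmark summarizes average-case robustness with a mean Corruption Error (mCE): a model's error rate on each corruption, summed over severity levels and normalized by a baseline model's error (AlexNet in the paper), then averaged over corruption types. A minimal sketch of that computation; the error rates below are placeholder numbers, not real results:

```python
# Sketch of the mean Corruption Error (mCE) metric from the ImageNet-C paper.
# All error rates here are hypothetical placeholders.

def corruption_error(model_errs, baseline_errs):
    """CE for one corruption: summed top-1 error over the severity
    levels, normalized by the baseline model's summed error."""
    return sum(model_errs) / sum(baseline_errs)

def mean_corruption_error(model, baseline):
    """mCE: average of the per-corruption CEs over all corruption types."""
    ces = [corruption_error(model[c], baseline[c]) for c in baseline]
    return sum(ces) / len(ces)

# Hypothetical per-severity top-1 error rates for two corruption types.
baseline = {"gaussian_noise": [0.6, 0.7, 0.8, 0.9, 0.95],
            "motion_blur":    [0.5, 0.6, 0.7, 0.8, 0.9]}
model    = {"gaussian_noise": [0.3, 0.35, 0.4, 0.5, 0.6],
            "motion_blur":    [0.25, 0.3, 0.4, 0.5, 0.55]}

print(round(mean_corruption_error(model, baseline), 3))
```

An mCE below 1.0 means the model degrades less under corruption than the baseline does; a lower value is better.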

Cited by 281 publications (481 citation statements)
References 23 publications
“…The robustness and out-of-distribution (OOD) generalization abilities of RELICv2 representations are tested on several datasets. We use ImageNetV2 (Recht et al, 2019) and ImageNet-C (Hendrycks and Dietterich, 2019) datasets to evaluate robustness. ImageNetV2 (Recht et al, 2019) has three sets of 10000 images that were collected to have a similar distribution to the original ImageNet validation set, while ImageNet-C (Hendrycks and Dietterich, 2019) consists of 15 synthetically generated corruptions (e.g.…”
Section: B5 Robustness and OOD Generalization
confidence: 99%
“…We use ImageNetV2 (Recht et al, 2019) and ImageNet-C (Hendrycks and Dietterich, 2019) datasets to evaluate robustness. ImageNetV2 (Recht et al, 2019) has three sets of 10000 images that were collected to have a similar distribution to the original ImageNet validation set, while ImageNet-C (Hendrycks and Dietterich, 2019) consists of 15 synthetically generated corruptions (e.g. blur, noise) that are added to the ImageNet validation set.…”
Section: B5 Robustness and OOD Generalization
confidence: 99%
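As the passages above note, ImageNet-C corruptions are pixel-level modifications (blur, noise, etc.) applied to validation images at several severity levels. A hedged sketch of one such corruption, additive Gaussian noise; the severity-to-sigma schedule here is an illustrative assumption, not the benchmark's actual values:

```python
import numpy as np

# Illustrative ImageNet-C-style Gaussian noise corruption.
# SIGMAS is an assumed per-severity scale, not the benchmark's schedule.
SIGMAS = [0.04, 0.06, 0.08, 0.09, 0.10]

def gaussian_noise(image, severity=1, seed=0):
    """Apply additive Gaussian noise to a float image in [0, 1]
    at one of five severity levels, clipping back to valid range."""
    rng = np.random.default_rng(seed)
    sigma = SIGMAS[severity - 1]
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

img = np.full((4, 4, 3), 0.5)          # dummy mid-gray image
corrupted = gaussian_noise(img, severity=3)
```

In the benchmark each corruption is applied at all five severities to the entire validation set, and models are evaluated on the corrupted copies without training on them.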
“…Deep neural networks are known to be vulnerable to adversarial examples and common corruptions (Bulusu et al, 2020). Hendrycks & Dietterich (2019); Hendrycks et al (2021) developed corruption robustness benchmarking datasets CIFAR-10/100-C, ImageNet-C, and ImageNet-R to facilitate robustness evaluations of CIFAR and ImageNet classification models. Michaelis et al (2019) extended this benchmark to object detection models.…”
Section: Related Work
confidence: 99%
“…To bridge this gap, we design 15 common corruptions for benchmarking corruption robustness of point cloud recognition models. It is worth noting that such designs are non-trivial since the manipulation space of 3D point clouds is completely different from 2D images where the corruptions come from the RGB modification (Hendrycks & Dietterich, 2019). In particular, we have three principles to design our benchmarks: i) Since we directly manipulate the position of points, we need to take extra care to preserve the original semantics of point clouds (Fig.…”
Section: D Point Cloud Corruption Robustness
confidence: 99%
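The design principle quoted above, perturbing point positions while preserving the semantics of the point cloud, can be illustrated with a clipped-jitter corruption: each point moves by a small, bounded offset, so the overall shape is retained. The function name and parameter values here are illustrative assumptions, not taken from the cited benchmark:

```python
import numpy as np

# Illustrative semantics-preserving point-cloud corruption: small
# per-point Gaussian jitter, clipped so no point moves far from its
# original position. Parameter values are assumptions for the sketch.

def jitter_points(points, sigma=0.01, clip=0.03, seed=0):
    """Add clipped Gaussian jitter to an (N, 3) point cloud."""
    rng = np.random.default_rng(seed)
    noise = np.clip(rng.normal(0.0, sigma, size=points.shape), -clip, clip)
    return points + noise

cloud = np.zeros((1024, 3))            # dummy point cloud at the origin
jittered = jitter_points(cloud, sigma=0.02, clip=0.05)
```

Because the displacement is bounded by `clip`, the corrupted cloud stays within a small envelope of the original shape, in contrast to 2D image corruptions that only alter RGB values.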