2019
DOI: 10.48550/arxiv.1905.02175
Preprint

Adversarial Examples Are Not Bugs, They Are Features

Abstract: Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features (derived from patterns in the data distribution) that are highly predictive, yet brittle and (thus) incomprehensible to humans. After capturing these features within a theoretical framework, we establish their widespread existence in standard datasets…
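To make the notion of "highly predictive, yet brittle" features concrete, below is a minimal, hedged sketch of how an adversarial example is typically constructed with projected gradient descent (PGD). The `model`, `x`, and `y` arguments are placeholders (any PyTorch image classifier and a batch of images in [0, 1]); this illustrates the general attack, not the authors' exact experimental setup.

```python
# Minimal PGD sketch (PyTorch); `model`, `x`, `y` are assumed placeholders.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Maximize the classification loss within an l-infinity ball of radius eps around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Take a signed gradient step, then project back onto the eps-ball
        # and the valid pixel range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()

# Usage (hypothetical): x_adv = pgd_attack(model, images, labels)
# The perturbed image is typically visually indistinguishable from the original,
# yet the model's prediction on it flips.
```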

Cited by 142 publications (237 citation statements)
References 17 publications
“…Also, we show that these methods consistently achieve higher standard accuracy (i.e., non-adversarial accuracy) than the nominal neural networks trained without robustness. While this result is not true for a general choice of uncertainty set (see for example Ilyas et al (2019)), we observe that when the uncertainty set has the appropriate size it can significantly improve the classification performance of the network, which is consistent with the results obtained for other classification models like Support Vector Machines, Logistic Regression and Classification Trees (Bertsimas et al, 2019).…”
Section: Introduction (supporting)
confidence: 87%
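The excerpt above concerns robust (adversarial) training, where the "uncertainty set" is the perturbation ball the model is trained to withstand. Below is a hedged, minimal sketch of one such training step; `model`, `optimizer`, and the batch `(x, y)` are assumed to exist, and the `eps` radius stands in for the uncertainty-set size the excerpt discusses. This is an illustration of the general scheme, not the cited authors' exact procedure: a moderate radius may help even standard accuracy, while an overly large one typically hurts it.

```python
# Hedged sketch of one robust-training step (PyTorch). `model`, `optimizer`,
# and the batch (x, y) are assumed; eps is the uncertainty-set radius.
import torch
import torch.nn.functional as F

def robust_training_step(model, optimizer, x, y, eps=8/255):
    # Inner maximization: find a worst-case point inside the l-infinity
    # eps-ball around x (a single FGSM step here, for brevity).
    x_pert = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_pert), y)
    grad, = torch.autograd.grad(loss, x_pert)
    x_adv = (x + eps * grad.sign()).clamp(0.0, 1.0).detach()

    # Outer minimization: update the model on the worst-case inputs.
    optimizer.zero_grad()
    robust_loss = F.cross_entropy(model(x_adv), y)
    robust_loss.backward()
    optimizer.step()
    return robust_loss.item()
```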
“…Some defenses focus on masking the computational process of the model, for example through non-differentiable layers [35]. Ilyas et al [10] claim that NNs do learn to classify correctly based on their training set, and that their vulnerability reflects higher-order features that exist in the dataset and are not accessible to humans. Therefore, their suggested defense was to train on special datasets that do not contain such features.…”
Section: B. Defense Methods (mentioning)
confidence: 99%
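For context on the defense mentioned above (training on a dataset stripped of non-robust features), here is a hedged sketch of the general idea: regenerate each training image so that it only matches an adversarially robust model's feature representation of the original. The `robust_features` function (e.g., the penultimate-layer embedding of a robustly trained network) and the hyperparameters are assumptions for illustration; the original work's initialization and solver may differ.

```python
# Hedged sketch of building a "robustified" training image (PyTorch).
# `robust_features` is assumed: e.g., the penultimate-layer embedding of an
# adversarially trained classifier. Hyperparameters are illustrative.
import torch

def robustify_image(robust_features, x, steps=200, lr=0.1):
    target = robust_features(x).detach()
    # Start from noise so that no signal from x is carried over except what is
    # needed to match the robust model's representation.
    x_r = torch.rand_like(x, requires_grad=True)
    opt = torch.optim.SGD([x_r], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = (robust_features(x_r) - target).pow(2).sum()
        loss.backward()
        opt.step()
        with torch.no_grad():
            x_r.clamp_(0.0, 1.0)
    return x_r.detach()

# Training a standard classifier on pairs (robustify_image(g, x), y) is the
# flavor of defense the excerpt attributes to Ilyas et al. [10].
```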
“…AEs may be robust to physical transformations [5] and therefore pose a security threat not only for cyber-world applications [6] but also for real-world systems, such as the computer vision of autonomous vehicles [7]. Although various defense techniques have been suggested, none of them can ensure a completely effective defense in all settings [8]-[10].…”
Section: Introduction (mentioning)
confidence: 99%
“…Shortcut learning. Recently, the community has realized that deep models may rely on shortcuts to make decisions [Beery et al., 2018, Niven and Kao, 2019, Ilyas et al., 2019, Geirhos et al., 2020, Huh et al., 2021]. Shortcuts are spurious features that are correlated with training labels but do not generalize on test data.…”
Section: Related Work (mentioning)
confidence: 99%
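To illustrate the shortcut phenomenon described in the excerpt above, here is a small self-contained toy example (all data and numbers are synthetic and purely illustrative): a feature that is perfectly correlated with the labels during training is randomized at test time, so a linear classifier that leans on it generalizes poorly.

```python
# Toy illustration of shortcut learning (synthetic data, illustrative only).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n = 2000
core = torch.randn(n, 1)                          # noisy but genuinely predictive feature
labels = (core.squeeze(1) + 0.5 * torch.randn(n) > 0).float()
shortcut_train = labels.unsqueeze(1) * 2 - 1      # spurious feature: perfect at train time
shortcut_test = torch.sign(torch.randn(n, 1))     # same feature, decorrelated at test time

x_train = torch.cat([core, shortcut_train], dim=1)
x_test = torch.cat([core, shortcut_test], dim=1)

# Plain logistic regression trained by gradient descent.
w = torch.zeros(2, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.5)
for _ in range(500):
    opt.zero_grad()
    loss = F.binary_cross_entropy_with_logits(x_train @ w, labels)
    loss.backward()
    opt.step()

accuracy = lambda x: ((x @ w > 0).float() == labels).float().mean().item()
# Train accuracy is near 1.0 (the shortcut separates the data perfectly);
# test accuracy drops sharply once the shortcut no longer carries label information.
print(f"train acc: {accuracy(x_train):.2f}, test acc: {accuracy(x_test):.2f}")
```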