2019 · Preprint
DOI: 10.48550/arxiv.1908.08413

Saliency Methods for Explaining Adversarial Attacks

Cited by 12 publications (13 citation statements) · References 10 publications

“…In this work, we propose to accomplish this by leveraging a common understanding of adversarial training. That is, the ground-truth predictions are stabilized during training and thus may have a smoother loss surface and smaller local Lipschitzness (Ross and Doshi-Velez, 2018; Cissé et al., 2017; Chan et al., 2020; Gu and Tresp, 2019; Wu et al., 2020a). This protection of the ground-truth class exists broadly in the vicinity of natural examples as well as of adversarial ones.…”
Section: Introduction (mentioning)
confidence: 99%
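The local Lipschitzness this excerpt refers to can be probed empirically. A minimal sketch, assuming a PyTorch classifier `model`, an input batch `x` with labels `y`, and an L-infinity radius `eps` (all names are illustrative, not taken from the cited papers): sample random perturbations inside the epsilon-ball and measure how sharply the cross-entropy loss changes.

```python
import torch
import torch.nn.functional as F

def estimate_local_lipschitz(model, x, y, eps=8/255, n_samples=32):
    """Rough empirical estimate of the local Lipschitz constant of the
    cross-entropy loss around x, via random L-inf perturbations.
    Argument names are illustrative, not from the cited papers."""
    model.eval()
    with torch.no_grad():
        base_loss = F.cross_entropy(model(x), y, reduction="none")  # (B,)
    worst_ratio = torch.zeros_like(base_loss)
    for _ in range(n_samples):
        delta = (torch.rand_like(x) * 2 - 1) * eps  # uniform in [-eps, eps]
        with torch.no_grad():
            pert_loss = F.cross_entropy(model(x + delta), y, reduction="none")
        # finite-difference ratio: |loss(x+delta) - loss(x)| / ||delta||_inf
        denom = delta.flatten(1).abs().amax(dim=1).clamp_min(1e-12)
        worst_ratio = torch.maximum(worst_ratio, (pert_loss - base_loss).abs() / denom)
    return worst_ratio  # smaller values indicate a locally smoother loss surface
```

Under adversarial training, this ratio tends to be smaller around both natural and adversarial inputs, which is the stabilization of the ground-truth class the excerpt describes.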
“…These intentionally perturbed inputs trick the attacked models into misclassifications. Indeed, explanation methods have already proven helpful in understanding adversarial examples and dataset-related issues in the context of image analysis [7,8], and have the potential to provide similar insights in support of recent analyses on audio adversarial examples for ASR [9,10,11,12,13].…”
Section: Introduction (mentioning)
confidence: 80%
“…In this section, we visualize and analyze models' attention to understand the differing robustness of DeiT and ResNet against patch-wise perturbations. Although there are many existing methods, e.g., [14,15,34,36,44], designed for CNNs to generate saliency maps, it is not yet clear how well they generalize to vision transformers. Therefore, we follow [21] and choose the model-agnostic vanilla gradient visualization method to compare the gradient (saliency) maps [43] of DeiT and ResNet.…”
Section: Understanding the Robustness of ViT to Patch-wise Perturbations (mentioning)
confidence: 99%
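The vanilla gradient (saliency) map mentioned in this excerpt is simply the gradient of the target class score with respect to the input pixels, which is why it is model-agnostic. A minimal PyTorch sketch of this technique (the model and input names are assumptions, not taken from the cited work) that applies equally to a ResNet or a DeiT:

```python
import torch

def vanilla_gradient_saliency(model, x, class_idx=None):
    """Vanilla gradient saliency: |d score_class / d x|, collapsed over channels.
    Works for any differentiable classifier, CNN or vision transformer."""
    model.eval()
    x = x.clone().detach().requires_grad_(True)   # (1, C, H, W)
    logits = model(x)
    if class_idx is None:
        class_idx = logits.argmax(dim=1).item()   # explain the predicted class
    logits[0, class_idx].backward()
    # Take the absolute gradient and the max over the channel dimension.
    return x.grad.abs().amax(dim=1).squeeze(0)    # (H, W) heat map

# Hypothetical usage, assuming torchvision is available and x is a
# normalized (1, 3, 224, 224) image tensor:
# import torchvision
# model = torchvision.models.resnet50(weights="IMAGENET1K_V2")
# heatmap = vanilla_gradient_saliency(model, x)
```

Comparing such heat maps for clean and adversarially perturbed inputs is the kind of analysis the cited comparison of DeiT and ResNet relies on.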