2022
DOI: 10.1109/tpami.2021.3083769

Attack to Fool and Explain Deep Networks

Abstract: Deep visual models are susceptible to adversarial perturbations to inputs. Although these signals are carefully crafted, they still appear as noise-like patterns to humans. This observation has led to the argument that deep visual representation is misaligned with human perception. We counter-argue by providing evidence of human-meaningful patterns in adversarial perturbations. We first propose an attack that fools a network to confuse a whole category of objects (source class) with a target label. Our attack als…
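To make the abstract's core idea concrete, below is a minimal, hedged sketch (not the paper's published algorithm) of a single perturbation optimized so that images of one source class are pushed toward a chosen target label. The names (model, source_loader, target_label, eps) and the FGSM-style update rule are illustrative assumptions, written in PyTorch.

import torch
import torch.nn.functional as F

def universal_targeted_perturbation(model, source_loader, target_label,
                                    eps=10/255, step=1/255, epochs=5,
                                    device="cpu"):
    # Hypothetical sketch: one perturbation "delta" shared by every image of the source class.
    model.eval()
    delta = torch.zeros(1, 3, 224, 224, device=device, requires_grad=True)
    for _ in range(epochs):
        for images, _ in source_loader:              # batches drawn from the source class only
            images = images.to(device)
            logits = model(torch.clamp(images + delta, 0.0, 1.0))
            target = torch.full((images.size(0),), target_label,
                                dtype=torch.long, device=device)
            loss = F.cross_entropy(logits, target)   # drive predictions toward the target label
            loss.backward()
            with torch.no_grad():
                delta -= step * delta.grad.sign()    # targeted, FGSM-style update
                delta.clamp_(-eps, eps)              # keep the perturbation quasi-imperceptible
            delta.grad.zero_()
    return delta.detach()

Because the perturbation is shared across an entire source class rather than tailored to a single image, it can be inspected on its own, which is the kind of signal the paper argues carries human-meaningful patterns.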

Cited by 23 publications (14 citation statements)
References 69 publications
“…Finally, we summarize physical attacks against image classification ( [17], [86], [90]- [93], [156], [162], [164], [165], [167]- [171]) in Table IV.…”
Section: Black-box Attacks
Mentioning confidence: 99%
“…However, the universal perturbation can still appear as a noise-like pattern to the human eye when an interpretation is applied. The study [10] proposed an extended universal perturbation that exploits the explainability of models by carefully exploring the decision boundaries of deep models. The authors showed that their attack can be used to interpret the internal workings of DL models.…”
Section: Related Work
Mentioning confidence: 99%
“…When an interpreter is adopted alongside a DL model, the involvement of the adversary can easily be detected (see Figure 1). The main idea of our attack is also based on generating a universal perturbation [10]. Unlike those attacks, we consider misleading the interpretation along with the classification while generating adversarial samples, to increase the robustness and stealthiness of the attack.…”
Section: Related Work
Mentioning confidence: 99%