2021
DOI: 10.48550/arxiv.2103.14586
Preprint

Understanding Robustness of Transformers for Image Classification

Abstract: Deep Convolutional Neural Networks (CNNs) have long been the architecture of choice for computer vision tasks. Recently, Transformer-based architectures like Vision Transformer (ViT) have matched or even surpassed ResNets for image classification. However, details of the Transformer architecture, such as the use of non-overlapping patches, lead one to wonder whether these networks are as robust. In this paper, we perform an extensive study of a variety of different measures of robustness of ViT models and compa…

Cited by 30 publications (69 citation statements)
References: 34 publications
“…tible noises to original data [1,22,32]. The characteristics of using non-overlapping patches in ViTs reduces the influence of adversarial examples with the same magnitude on the overall results [2].…”
Section: Boundary Heatmap Par Heatmap
Type: mentioning
confidence: 99%
“…In the image classification task, the decision-based attacks [4,10,34] start from a random noise with a large noise magnitude, randomly sample in the image input space, and gradually compress the adversarial noise under the premise of ensuring misclassification. The existing adversarial attacks against transformers are only white-box attacks [2,26] and transfer-based black-box attacks [21,33]. Black-box attacks against ViTs without substitute model remains an open problem.…”
Section: Boundary Heatmap Par Heatmap
Type: mentioning
confidence: 99%
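
The decision-based attack loop described in the citation statement above (start from large random noise, sample candidates, shrink the noise while the input stays misclassified) can be illustrated with a minimal sketch. This is not the method of the cited works [4,10,34]: the classifier `predict` is a hypothetical toy linear model standing in for a ViT, and the names `decision_based_attack`, `distance`, `steps`, `max_init_tries`, and `seed` are assumptions made for illustration only.

```python
import random


def predict(x):
    """Hypothetical black-box classifier: exposes only a hard label."""
    return 1 if sum(x) > 0.0 else 0


def distance(a, b):
    """Euclidean distance between two equally sized vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def decision_based_attack(x_orig, predict_fn, steps=300, max_init_tries=1000, seed=0):
    """Toy decision-based attack that queries only predicted labels."""
    rng = random.Random(seed)
    y_orig = predict_fn(x_orig)

    # Step 1: start from large random noise that is already misclassified.
    x_adv = None
    for _ in range(max_init_tries):
        candidate = [rng.uniform(-10.0, 10.0) for _ in x_orig]
        if predict_fn(candidate) != y_orig:
            x_adv = candidate
            break
    if x_adv is None:
        raise RuntimeError("no misclassified starting point found")

    # Step 2: propose candidates nearer to the original input and keep one
    # only if it is both closer and still misclassified.
    for _ in range(steps):
        shrink = rng.uniform(0.0, 0.5)
        candidate = [
            a + shrink * (o - a) + rng.gauss(0.0, 0.01)
            for a, o in zip(x_adv, x_orig)
        ]
        closer = distance(candidate, x_orig) < distance(x_adv, x_orig)
        if closer and predict_fn(candidate) != y_orig:
            x_adv = candidate
    return x_adv


x = [0.5, 0.5, 0.5, 0.5]
adv = decision_based_attack(x, predict)
print("original label:", predict(x), "adversarial label:", predict(adv))
print("perturbation norm:", round(distance(adv, x), 4))
```

Because a candidate is accepted only when it remains misclassified, the returned point always carries a label different from the original input; the perturbation norm can only decrease from its starting value.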