2022
DOI: 10.48550/arxiv.2203.02250
Preprint

Patch Similarity Aware Data-Free Quantization for Vision Transformers

Abstract: Vision transformers have recently gained great success on various computer vision tasks; nevertheless, their high model complexity makes it challenging to deploy on resource-constrained devices. Quantization is an effective approach to reduce model complexity, and data-free quantization, which can address data privacy and security concerns during model deployment, has received widespread interest. Unfortunately, all existing methods, such as BN regularization, were designed for convolutional neural networks and …
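For context on the quantization the abstract refers to, the sketch below illustrates symmetric uniform (simulated) quantization of a tensor in PyTorch; the function name, the per-tensor scale, and the bit-widths are illustrative assumptions, not the paper's method.

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Symmetric uniform quantization, simulated ("fake") in float:
    round values onto an integer grid, then scale back."""
    qmax = 2 ** (num_bits - 1) - 1        # e.g. 127 for 8 bits, 7 for 4 bits
    scale = x.abs().max() / qmax          # one scale per tensor (an assumption)
    x_int = torch.clamp(torch.round(x / scale), -qmax, qmax)
    return x_int * scale                  # dequantize to simulate the error

# Example: 4-bit weight quantization and the resulting error
w = torch.randn(64, 64)
w_q = fake_quantize(w, num_bits=4)
print((w - w_q).abs().max())
```

Data-free methods such as the one proposed here concern how to calibrate such quantizers (i.e., choose scales from representative activations) without access to the original training data.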

Cited by 2 publications (4 citation statements) | References 31 publications (56 reference statements)
“…Furthermore, ViTs have also been applied to more complex vision applications, such as object detection [2,27] and semantic segmentation [3]. Despite their promising performance, ViTs' complicated architectures, with large memory footprints and computational overheads, are intolerable in real-world applications [15,19], especially in time/resource-constrained scenarios. Thus, compression approaches for ViTs are necessary for practical deployment.…”
Section: Related Work 2.1 Vision Transformers
confidence: 99%
“…Ranking loss [19] is presented to maintain the correct relative order of the quantized attention map. PSAQ-ViT [15] pushes the quantization of ViTs to data-free scenarios based on patch similarity. To realize the full quantization of ViTs, FQ-ViT [17] introduces quantization strategies for LayerNorm and Softmax.…”
Section: Model Quantization
confidence: 99%
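As a rough illustration of the patch similarity idea the citation above attributes to PSAQ-ViT: one can score how diverse the pairwise cosine similarities of a ViT's patch tokens are with an entropy estimate. This is a minimal sketch under stated assumptions; the function name and tensor shapes are hypothetical, and a histogram density is used as a simpler stand-in where the paper uses kernel density estimation.

```python
import torch
import torch.nn.functional as F

def patch_similarity_entropy(tokens: torch.Tensor, num_bins: int = 100) -> torch.Tensor:
    """Entropy of the pairwise cosine-similarity distribution of patch
    tokens with shape [N, D]. Higher entropy means more diverse patch
    responses, the kind of signal PSAQ-ViT optimizes when synthesizing
    calibration images."""
    # [N, N] matrix of cosine similarities between all patch pairs
    sim = F.cosine_similarity(tokens.unsqueeze(0), tokens.unsqueeze(1), dim=-1)
    vals = sim.flatten()
    # Histogram-based density estimate over [-1, 1] (a stand-in for
    # the kernel density estimation used in the paper)
    hist = torch.histc(vals, bins=num_bins, min=-1.0, max=1.0)
    p = hist / hist.sum()
    p = p[p > 0]                          # drop empty bins before the log
    return -(p * p.log()).sum()

# Example: 196 patch tokens of dimension 384 (ViT-S-like shapes, an assumption)
tokens = torch.randn(196, 384)
print(patch_similarity_entropy(tokens))
```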
“…This is best illustrated on ViT, where PowerQuant W4/A8 outperforms both DFQ and SQuant by a whopping 4.91 points, even when they are allowed 8 bits for the weights (W8/A8). The proposed PowerQuant even outperforms methods dedicated to transformer quantization, such as PSAQ-ViT (Li et al., 2022), on every image transformer tested.…”
confidence: 95%