2021
DOI: 10.48550/arxiv.2111.13824
Preprint

FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

Abstract: Network quantization significantly reduces model inference complexity and has been widely used in real-world deployments. However, most existing quantization methods have been developed and tested mainly on Convolutional Neural Networks (CNNs), and suffer severe degradation when applied to Transformer-based architectures. In this work, we present a systematic method to reduce the performance degradation and inference complexity of Quantized Transformers. In particular, we propose Powers-of-Two Scale (PTS) to de…

Cited by 2 publications (7 citation statements) | References 28 publications

“…FQ-ViT [43] adopts log2 quantization (see Eq. 9) and assigns more bins to the frequently occurring small values found in post-softmax activations (attention maps), in contrast to 4-bit uniform quantization, which allocates only one bin for these values.…”
Section: A. Activation Quantization Optimization
Mentioning, confidence: 99%
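
As a rough illustration of the contrast described above, the sketch below compares 4-bit log2 quantization of post-softmax attention values against a plain 4-bit uniform quantizer. The function names, clipping choices, and example values are illustrative assumptions, not FQ-ViT's exact implementation.

```python
import numpy as np

def log2_quantize(attn, n_bits=4):
    """Log2 quantization of post-softmax attention values in [0, 1].

    Values map to q = clip(round(-log2(x)), 0, 2**n_bits - 1) and are
    dequantized as 2**(-q), so the frequent near-zero attention weights
    keep distinct quantization levels.
    """
    qmax = 2 ** n_bits - 1
    # Guard against log2(0); anything below 2**(-qmax) saturates to the last bin.
    q = np.clip(np.round(-np.log2(np.maximum(attn, 2.0 ** -qmax))), 0, qmax)
    return 2.0 ** -q

def uniform_quantize(attn, n_bits=4):
    """Baseline uniform quantization with scale 1 / (2**n_bits - 1)."""
    qmax = 2 ** n_bits - 1
    scale = 1.0 / qmax
    return np.clip(np.round(attn / scale), 0, qmax) * scale

attn = np.array([0.6, 0.05, 0.01, 0.002])   # typical post-softmax magnitudes
print(log2_quantize(attn))     # [0.5, 0.0625, 0.0078125, 0.001953125]
print(uniform_quantize(attn))  # [0.6, 0.0667, 0.0, 0.0] -- small values collapse
```

With the uniform quantizer every value below 1/30 falls into the zero bin, while the log2 quantizer still separates these small attention weights, which is the point the citing statement makes.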
“…2) Post-LayerNorm Activation: FQ-ViT [43] proposes the Power-of-Two Factor (PTF) for Pre-LayerNorm quantization. The core idea of PTF is to assign different factors to different channels instead of different quantization parameters.…”
Section: A. Activation Quantization Optimization
Mentioning, confidence: 99%
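
The quoted idea of per-channel power-of-two factors over a single layer-wise scale can be sketched as follows; the helper name ptf_quantize, the symmetric signed range, and the way each channel's factor is chosen are assumptions made for illustration, not the paper's exact procedure.

```python
import numpy as np

def ptf_quantize(x, n_bits=8, K=3):
    """Sketch of per-channel power-of-two factors over one shared layer scale.

    All channels share a single base scale `s`; channel c only chooses an
    integer alpha_c in {0, ..., K}, so its effective scale is s * 2**alpha_c,
    which a runtime can realize as a bit-shift rather than a per-channel
    multiply. `x` has shape (tokens, channels), e.g. a LayerNorm input.
    """
    qmax = 2 ** (n_bits - 1) - 1                       # symmetric signed range
    ch_max = np.abs(x).max(axis=0) + 1e-12             # per-channel range
    s = ch_max.max() / (qmax * 2 ** K)                 # shared base scale
    # Smallest alpha_c such that s * 2**alpha_c * qmax covers the channel range.
    alpha = np.clip(np.ceil(np.log2(ch_max / (s * qmax))), 0, K).astype(int)
    scale_c = s * 2.0 ** alpha                         # effective per-channel scale
    q = np.clip(np.round(x / scale_c), -qmax - 1, qmax)
    return q * scale_c, alpha                          # dequantized tensor, factors

# Channels with strongly different ranges, as is typical for LayerNorm inputs.
x = np.random.randn(197, 768) * np.logspace(-1.0, 1.0, 768)
x_hat, alpha = ptf_quantize(x)
print(alpha.min(), alpha.max())   # small-range channels receive small factors
```

Keeping one shared base scale plus integer shift factors is what lets the activation be handled with layer-wise quantization parameters while still absorbing the inter-channel variation the statement refers to.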