2020
DOI: 10.48550/arxiv.2004.10568
Preprint
Up or Down? Adaptive Rounding for Post-Training Quantization

Cited by 10 publications (23 citation statements)
References 0 publications
“…Post-Training Quantization Post-training quantization [3,1,24,23] needs no training, only a subset of the dataset for calibrating the quantization parameters, including the clipping threshold and bias correction. Commonly, the quantization parameters of both weights and activations are decided before inference.…”
Section: Quantization Methods (mentioning)
confidence: 99%
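The calibration step described in this statement can be sketched in a few lines. The following is a minimal NumPy illustration, not the exact procedure of any cited paper: it grid-searches a symmetric clipping threshold that minimizes quantization MSE on a calibration batch, and applies a simple bias correction for a linear layer (the function names and the linear-layer setting are assumptions for illustration).

```python
import numpy as np

def calibrate_clipping_threshold(x, n_bits=8, n_candidates=100):
    """Grid-search a symmetric clipping threshold that minimizes quantization
    MSE on a calibration batch (a generic illustration of threshold
    calibration, not the procedure of any specific cited paper)."""
    max_abs = max(np.abs(x).max(), 1e-8)
    q_max = 2 ** (n_bits - 1) - 1
    best_t, best_err = max_abs, np.inf
    for t in np.linspace(0.5 * max_abs, max_abs, n_candidates):
        scale = t / q_max
        q = np.clip(np.round(x / scale), -q_max - 1, q_max)
        err = np.mean((q * scale - x) ** 2)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bias_correction(w_fp, w_q, x_calib, bias):
    """Shift the bias so the quantized linear layer matches the float layer's
    mean pre-activation on the calibration data (illustrative only)."""
    return bias + (x_calib @ (w_fp - w_q).T).mean(axis=0)
```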
“…Post-training quantization aims to quantize neural networks using a small part of the dataset (in some cases no data at all) for calibration of quantization parameters to ensure a certain local criterion (e.g., correspondence of minimum and maximum, MSE minimality). Recent work [22] showed that minimizing the mean squared error (MSE) introduced in the preactivations might be considered (under certain assumptions) as the best possible local criterion and performed optimization of rounding policy based on it. Works [12] and [28] utilize the same local criterion but optimize weights and quantization parameters directly and employ per-channel weight quantization, thus considering a simplified task.…”
Section: Quantization Methods (mentioning)
confidence: 99%
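As a rough illustration of optimizing the rounding policy against a pre-activation MSE criterion, the sketch below greedily decides, per weight, whether to round up or down so that the quantized layer's pre-activations stay close to the float ones on calibration data. This is a toy stand-in: AdaRound itself optimizes a continuous relaxation of the rounding variables rather than performing a greedy sweep, and the function below is an assumption-laden illustration of the criterion, not the paper's algorithm.

```python
import numpy as np

def optimize_rounding(w, x_calib, scale, n_passes=2):
    """Choose floor vs. ceil for each weight so that the quantized layer's
    pre-activations stay close (in MSE) to the float ones on calibration
    data. Greedy coordinate search; a toy stand-in for learned soft rounding."""
    low, high = np.floor(w / scale), np.ceil(w / scale)
    up = np.round(w / scale) == high           # start from round-to-nearest
    y_fp = x_calib @ w.T                       # float pre-activations

    def mse(mask):
        w_q = np.where(mask, high, low) * scale
        return np.mean((x_calib @ w_q.T - y_fp) ** 2)

    for _ in range(n_passes):
        for idx in np.ndindex(*w.shape):       # flip one rounding decision at a time
            before = mse(up)
            up[idx] = ~up[idx]
            if mse(up) > before:               # keep the flip only if it helps
                up[idx] = ~up[idx]
    return np.where(up, high, low) * scale
```

A gradient-based soft relaxation of the same objective scales far better than this exhaustive sweep; the sketch is only meant to make the local MSE criterion concrete.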
“…At the same time, it does not involve significant overhead in computations. Nevertheless, considerable research efforts were concentrated on eliminating the need for per-channel quantization of weights to simplify the implementation of quantized operations [21,22]. In our work, we investigate the importance of per-channel quantization for GANs.…”
Section: Per-channel and Per-tensor Weight Quantization (mentioning)
confidence: 99%
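To make the per-channel vs. per-tensor distinction concrete, here is a small NumPy sketch of symmetric uniform weight quantization (the helper names are made up for illustration). Per-channel quantization keeps one scale per output channel, which tracks widely varying channel ranges much better than a single tensor-wide scale, at the cost of a slightly more involved kernel implementation.

```python
import numpy as np

def quantize_per_tensor(w, n_bits=8):
    """One scale shared by the whole weight tensor (simplest hardware path)."""
    q_max = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max() / q_max
    return np.clip(np.round(w / scale), -q_max - 1, q_max) * scale

def quantize_per_channel(w, n_bits=8):
    """One scale per output channel (rows of w)."""
    q_max = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / q_max
    return np.clip(np.round(w / scale), -q_max - 1, q_max) * scale

# Example: a weight matrix whose rows have very different magnitudes.
w = np.random.randn(4, 16) * np.array([[0.01], [0.1], [1.0], [10.0]])
for name, w_q in [("per-tensor", quantize_per_tensor(w)),
                  ("per-channel", quantize_per_channel(w))]:
    print(name, "MSE:", np.mean((w - w_q) ** 2))
```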
“…[5,45] even do quantization without accessing any real data. [63,64] adopt intermediate feature-map reconstruction to optimize the rounding policy.…”
Section: Related Work (mentioning)
confidence: 99%
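The block-wise objective mentioned in this statement can be written down compactly. Below is a hedged sketch assuming the float block and its quantized counterpart are exposed as callables (a hypothetical interface, not the API of the cited works): the reconstruction loss compares their intermediate feature maps on calibration inputs, and the rounding decisions inside the quantized block are then optimized to minimize it.

```python
import numpy as np

def feature_map_reconstruction_loss(block_fp, block_q, x_calib):
    """MSE between the intermediate feature maps of the float block and its
    quantized counterpart on calibration inputs. `block_fp` / `block_q` are
    assumed to be callables mapping an input batch to a feature map."""
    return np.mean((block_q(x_calib) - block_fp(x_calib)) ** 2)

# Toy usage with linear "blocks": the loss would be minimized over the
# rounding decisions inside block_q (e.g. with the greedy sweep sketched
# above or a gradient-based relaxation).
w_fp = np.random.randn(8, 16)
w_q = np.round(w_fp / 0.05) * 0.05            # naive round-to-nearest baseline
x = np.random.randn(32, 16)
loss = feature_map_reconstruction_loss(lambda a: a @ w_fp.T, lambda a: a @ w_q.T, x)
print("reconstruction MSE:", loss)
```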