2022
DOI: 10.48550/arxiv.2202.07471
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

Abstract: Quantization of deep neural networks (DNN) has been proven effective for compressing and accelerating DNN models. Data-free quantization (DFQ) is a promising approach without the original datasets under privacy-sensitive and confidential scenarios. However, current DFQ solutions degrade accuracy, need synthetic data to calibrate networks, and are time-consuming and costly. This paper proposes an on-the-fly DFQ framework with sub-second quantization time, called SQuant, which can quantize networks on inference-… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Publication Types

Select...

Relationship

0
0

Authors

Journals

citations
Cited by 0 publications
references
References 29 publications
0
0
0
Order By: Relevance

No citations

Set email alert for when this publication receives citations?