2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr46437.2021.00638

Network Quantization with Element-wise Gradient Scaling

Cited by 78 publications (77 citation statements). References 16 publications.
“…Model quantization: Besides clustering and regularization methods, model quantization can also reduce the model size, and training-time quantization techniques have been developed to improve the accuracy of quantized models. EWGS [9] adjusts gradients by scaling them up or down based on the Hessian approximation for each layer. PROFIT [12] adopts an iterative process and freezes layers based on the activation instability.…”
Section: Model Compression Using Regularization (mentioning)
confidence: 99%
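The excerpt above summarizes EWGS as scaling each gradient element up or down during quantization-aware training. The following is a minimal, hypothetical PyTorch sketch of that idea for a uniform quantizer: the backward pass rescales each element of the straight-through gradient by a factor that depends on the gradient sign and the local rounding error. The names (EWGSQuantizer, num_levels) and the fixed scalar delta are illustrative assumptions; the paper itself derives the scaling factor from approximate second-order (Hessian) information per layer.

```python
import torch

class EWGSQuantizer(torch.autograd.Function):
    """Sketch of element-wise gradient scaling for a uniform quantizer.

    Forward: round an input (assumed normalized to [0, 1]) to a small grid.
    Backward: instead of the plain straight-through estimator, scale each
    gradient element as g_x = g_q * (1 + delta * sign(g_q) * (x - q)).
    `delta` is a hypothetical hyper-parameter here, standing in for the
    Hessian-based factor used in the paper.
    """

    @staticmethod
    def forward(ctx, x, num_levels=4, delta=0.1):
        q = torch.round(x * (num_levels - 1)) / (num_levels - 1)  # uniform quantization
        ctx.save_for_backward(x, q)
        ctx.delta = delta
        return q

    @staticmethod
    def backward(ctx, grad_q):
        x, q = ctx.saved_tensors
        # Scale each gradient element up or down depending on the sign of the
        # gradient and whether the latent value lies above or below its level.
        scale = 1.0 + ctx.delta * torch.sign(grad_q) * (x - q)
        return grad_q * scale, None, None


# Minimal usage: quantize a tensor and backpropagate through the scaled estimator.
x = torch.rand(8, requires_grad=True)
y = EWGSQuantizer.apply(x)
y.sum().backward()
print(x.grad)
```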
“…• Instead of hard weight-cluster assignment and approximated gradient [6,9,19,26], DKM uses flexible and differentiable attention-based weight clustering and computes gradients w.r.t the task loss without approximation.…”
Section: Differentiable K-means Clustering Layer for Weight-clustering (mentioning)
confidence: 99%
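The excerpt contrasts hard cluster assignment with DKM's differentiable, attention-based weight clustering. Below is a small, hypothetical sketch of that general idea, not the actual DKM implementation: each weight attends to every centroid via a softmax over negative distances, so the task-loss gradient reaches both weights and centroids without a straight-through approximation. The function name and the temperature parameter are assumptions for illustration.

```python
import torch

def soft_cluster_weights(weights, centroids, temperature=1.0):
    """Sketch of attention-based (soft) weight clustering.

    Each weight is reconstructed as an attention-weighted sum of centroids,
    where attention is a softmax over negative weight-centroid distances.
    `temperature` controls how close the soft assignment is to a hard one.
    """
    dist = (weights.unsqueeze(1) - centroids.unsqueeze(0)).abs()  # (N, K) distances
    attn = torch.softmax(-dist / temperature, dim=1)              # soft assignment
    return attn @ centroids                                       # reconstructed weights


# Minimal usage: cluster a weight vector onto 4 learnable centroids.
w = torch.randn(16, requires_grad=True)
c = torch.randn(4, requires_grad=True)
w_hat = soft_cluster_weights(w, c)
loss = (w_hat ** 2).sum()   # stand-in for a task loss
loss.backward()
print(w.grad.shape, c.grad.shape)
```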