2020
DOI: 10.48550/arxiv.2002.00523
Preprint

Automatic Pruning for Quantized Neural Networks

Abstract: Neural network quantization and pruning are two techniques commonly used to reduce the computational complexity and memory footprint of these models for deployment. However, most existing pruning strategies operate on full-precision models and cannot be directly applied to the discrete parameter distributions after quantization. In contrast, we study a combination of these two techniques to achieve further network compression. In particular, we propose an effective pruning strategy for selecting redundant low-precision f…
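
As a rough illustration of why full-precision pruning criteria transfer poorly to quantized models, the sketch below applies a symmetric uniform quantizer to a stack of convolutional filters: once weights are snapped to a few discrete levels, per-filter norms collapse onto a coarse grid and magnitude-based ranking loses most of its discriminative power. This is a minimal, assumed example (random filters, 2-bit symmetric quantizer), not the quantization scheme used in the paper.

```python
import numpy as np

def uniform_quantize(w, num_bits=2):
    """Symmetric uniform quantizer onto the integer grid {-q_max, ..., q_max} * scale.
    Illustrative sketch only (num_bits >= 2); the paper's exact quantizer may differ."""
    q_max = 2 ** (num_bits - 1) - 1              # e.g. 1 for 2 bits (ternary-like levels)
    scale = np.abs(w).max() / q_max
    return np.clip(np.round(w / scale), -q_max, q_max) * scale

rng = np.random.default_rng(0)
filters = rng.normal(size=(16, 3, 3, 3))          # 16 hypothetical conv filters, 3x3x3 each
q_filters = uniform_quantize(filters, num_bits=2)

# After quantization, each per-filter L1 norm is (number of nonzero weights) * scale,
# so the norms fall on a coarse grid and many filters tie; a full-precision magnitude
# criterion can no longer rank filters reliably.
print(np.sort(np.abs(filters).sum(axis=(1, 2, 3))))
print(np.sort(np.abs(q_filters).sum(axis=(1, 2, 3))))
```
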


Cited by 6 publications (6 citation statements)
References 23 publications
“…MultiMNIST: Our model shows a comparable or even better accuracy with a substantial decrease in the number of parameters. ResNet_XnIDR achieves the best Top-2 accuracy, 99.37%, on MultiMNIST [34].

  Method                                      Accuracy (%)
  Aff-CapsNets [12]                           76.28
  CapsNetSIFT [25]                            91.27
  HGCNet-91 [40]                              94.47
  Ternary connect + Quantized backprop [24]   87.99
  Greedy Algorithm for Quantizing [28]        88.88
  SLB on ResNet20 [43]                        92.1
  SLB on VGG small [43]                       94.1
  DoReFa-Net on VGG-11 [13]                   86.30
  DoReFa-Net on ResNet14 [13]                 89.84…”
Section: Experiments Results
Mentioning, confidence: 99%
“…Touvron et al presented a weight searching algorithm to search for discrete weights and avoid gradient estimation and non-differentiable problems to improve the accuracy during training the quantized deep neural network [43]. Wang et al proposed a pruning algorithm to point out unnecessary low-precision filters and utilize Bayesian optimization to decide the pruning ratio [13]. These papers are very good, but less revolutionary than [11], [16], and [24].…”
Section: Xnor Network
Mentioning, confidence: 99%
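
The Bayesian-optimization step mentioned in the statement above (choosing how much to prune once redundant low-precision filters have been identified) can be sketched as a black-box search over per-layer pruning ratios. The sketch below is a minimal illustration assuming scikit-optimize's gp_minimize as the optimizer; evaluate_pruned_model, the three-layer search space, and the [0, 0.9] bounds are placeholders, not details from the cited paper.

```python
from skopt import gp_minimize
from skopt.space import Real

def evaluate_pruned_model(ratios):
    """Placeholder objective: prune each layer of the quantized network by the given
    ratios, fine-tune briefly, and return a score to minimize (e.g. negative validation
    accuracy). The dummy expression below only stands in for that routine."""
    return sum((r - 0.5) ** 2 for r in ratios)

# One pruning ratio per layer, each searched in [0, 0.9].
search_space = [Real(0.0, 0.9, name=f"layer_{i}") for i in range(3)]

result = gp_minimize(
    evaluate_pruned_model,   # objective to minimize
    search_space,            # per-layer pruning-ratio bounds
    n_calls=20,              # number of Bayesian-optimization evaluations
    random_state=0,
)
print("best per-layer pruning ratios:", result.x)
```
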
“…And Ref. [28] focuses on the gradient part, which scales the gradient according to the position of the weight vector, making it easier to compress. Reference [29] explores how to automatically do pruning when quantization.…”
Section: Pruning and Quantization Mixed Compression Methods
Mentioning, confidence: 99%
“…However, there is a trade-off between accuracy and pruning; accuracy may decrease when the pruning rates increase. In [111], the authors utilize Bayesian optimization for channel pruning for quantized neural networks. That pruning approach based on the angle preservation feature of high dimensional binary vectors [112] and the euclidean distance.…”
Section: B: Pruning
Mentioning, confidence: 99%
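
The "angle preservation" property referred to in the last statement can be made concrete for binary filters: when weights take values in {-1, +1}, the squared Euclidean distance between two flattened filters of length n determines their dot product (||x − y||² = 2n − 2x·y), so the angle between filters is directly recoverable from the Euclidean distance and near-parallel filters can be flagged as redundant. Below is a minimal sketch under that assumption; the random filters, the planted duplicate, and the 0.2 rad threshold are illustrative choices, not values from the paper.

```python
import numpy as np

def pairwise_angles(binary_filters):
    """Angles between flattened {-1, +1} filters, recovered from Euclidean distance.
    For x, y in {-1, +1}^n:  ||x - y||^2 = 2n - 2 x.y,  so  cos(theta) = 1 - ||x - y||^2 / (2n)."""
    flat = binary_filters.reshape(len(binary_filters), -1)
    n = flat.shape[1]
    sq_dist = ((flat[:, None, :] - flat[None, :, :]) ** 2).sum(-1)
    cos_theta = 1.0 - sq_dist / (2.0 * n)
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

rng = np.random.default_rng(0)
filters = rng.choice([-1.0, 1.0], size=(8, 3, 3, 3))   # 8 hypothetical binary filters
filters[1] = filters[0]                                  # plant a duplicate to detect

angles = pairwise_angles(filters)
np.fill_diagonal(angles, np.inf)                         # ignore self-comparisons
redundant_pairs = np.argwhere(angles < 0.2)              # small angle => near-duplicate filters
print("redundant filter pairs:", redundant_pairs)
```
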