2020
DOI: 10.1007/978-3-030-50420-5_34

Retrain or Not Retrain? - Efficient Pruning Methods of Deep CNN Networks

Cited by 16 publications (20 citation statements); references 6 publications.
“…However, with sufficient pruning, the networks will eventually suffer large declines in performance. To mitigate this, the networks can be retrained, such as after pruning or over the course of progressive pruning (Mittal et al., 2019; Marcin and Maciej, 2020). It has been shown that this retraining allows the removal of a substantially larger number of connections while retaining comparable performance.…”
Section: Discussion
confidence: 99%
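To make the prune-then-retrain procedure described in the statement above concrete, here is a minimal sketch of magnitude pruning followed by fine-tuning, with the pruning mask re-applied after every update so removed connections stay at zero. This is not the cited paper's exact procedure; the PyTorch model, data loader, 50% sparsity target, and optimizer settings are illustrative assumptions.

```python
# Minimal sketch of iterative magnitude pruning with retraining (illustrative,
# not the cited paper's exact method).
import torch
import torch.nn as nn

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a 0/1 mask that zeroes the `sparsity` fraction of smallest-magnitude weights."""
    threshold = torch.quantile(weight.abs().flatten(), sparsity)
    return (weight.abs() > threshold).float()

def prune_and_retrain(model: nn.Module, loader, sparsity=0.5, epochs=2, lr=1e-3):
    # 1) Build masks from current weight magnitudes and apply them.
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:                      # prune weight matrices / conv kernels only
            masks[name] = magnitude_mask(p.data, sparsity)
            p.data *= masks[name]
    # 2) Retrain; re-apply masks after every step so pruned weights stay zero.
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            for name, p in model.named_parameters():
                if name in masks:
                    p.data *= masks[name]
    return model
```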
“…However, it remains to be seen whether this better recapitulates the slower manifestation of cognitive deficits seen clinically (Fox et al., 1999; Zarei et al., 2013). In contrast to random weight ablation, ablation of connections based on their strengths, such as in network pruning (Han et al., 2015; Mittal et al., 2019; Marcin and Maciej, 2020), or specifically targeting excitatory (positive) or inhibitory (negative) connections (Song et al., 2016; Mackwood et al., 2021), may be instructive.…”
Section: Discussion
confidence: 99%
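The contrast drawn above, random ablation versus strength-based ablation as used in network pruning, can be illustrated with a short sketch. This is a simplified illustration rather than any cited work's procedure; the weight matrix, the 30% ablation fraction, and the random seeds are assumptions.

```python
# Sketch comparing random ablation of connections with magnitude-based ablation
# (removing the weakest connections first), as contrasted in the text above.
import numpy as np

def ablate(weights: np.ndarray, fraction: float, scheme: str, rng=None) -> np.ndarray:
    if rng is None:
        rng = np.random.default_rng(0)
    w = weights.copy()
    n_remove = int(fraction * w.size)
    if scheme == "random":
        idx = rng.choice(w.size, size=n_remove, replace=False)
    elif scheme == "magnitude":          # remove the smallest-magnitude connections
        idx = np.argsort(np.abs(w), axis=None)[:n_remove]
    else:
        raise ValueError(scheme)
    w.flat[idx] = 0.0
    return w

W = np.random.default_rng(1).normal(size=(64, 64))
print("random ablation sparsity:   ", np.mean(ablate(W, 0.3, "random") == 0))
print("magnitude ablation sparsity:", np.mean(ablate(W, 0.3, "magnitude") == 0))
```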
“…There are many different approaches to efficiently compress neural network models without too much accuracy degradation [14,15,41,42]. Typically, a pruned network suffers a sharp accuracy degradation unless the pruned structure is re-trained with the training data [14,41,43]. Quantization suffers less from this problem, and there are many different approaches, including simply quantizing weights after training (post-training quantization), re-training after quantization, and more [44][45][46].…”
Section: Neural Network Compression
confidence: 99%
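As a concrete example of the post-training quantization mentioned above, the following sketch maps already-trained FP32 weights to INT8 with a single per-tensor scale and no retraining step. The symmetric scheme, per-tensor scaling, and random weights are illustrative assumptions, not the cited papers' implementations.

```python
# Sketch of symmetric post-training INT8 quantization of a weight tensor.
import numpy as np

def quantize_int8(w: np.ndarray):
    scale = np.abs(w).max() / 127.0          # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(scale=0.05, size=(256, 256)).astype(np.float32)
q, s = quantize_int8(w)
print("max abs reconstruction error:", np.abs(w - dequantize(q, s)).max())  # bounded by ~scale/2
```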
“…Values can be quantized from the typical 4-byte floating-point values to 8-bit integers [47], or even more aggressively to ternary [48,49] or binary [50,51] values, with varying accuracy-loss trade-offs. Quantization is typically more readily supported on embedded neural networks than pruning, since pruning can introduce sparsity in the weights, resulting in complex random access [43,52].…”
Section: Neural Network Compression
confidence: 99%
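The more aggressive schemes cited above can also be sketched briefly: binary weights keep only the sign with one shared scale, while ternary weights map values to {-1, 0, +1} using a magnitude threshold. The 0.7×mean threshold and the scaling choices are illustrative assumptions, not the exact schemes of the cited works.

```python
# Sketch of binary and ternary weight quantization.
import numpy as np

def binarize(w: np.ndarray) -> np.ndarray:
    alpha = np.abs(w).mean()                 # shared per-tensor scaling factor
    return alpha * np.sign(w)

def ternarize(w: np.ndarray) -> np.ndarray:
    delta = 0.7 * np.abs(w).mean()           # values below the threshold become 0
    keep = np.abs(w) > delta
    alpha = np.abs(w[keep]).mean() if np.any(keep) else 0.0
    return alpha * np.where(keep, np.sign(w), 0.0)

w = np.random.default_rng(0).normal(size=(128, 128))
print("mean abs error, binary: ", np.abs(w - binarize(w)).mean())
print("mean abs error, ternary:", np.abs(w - ternarize(w)).mean())
```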
“…In quantization, the neural network weights and/or the feature maps are expressed using shorter data types, such as FP16, INT16, or INT8, instead of FP32 [6]; this leads to a lower memory footprint as well as lower latency, since the computation cost is reduced and SIMD instructions can be used to calculate more operations per instruction. In weight pruning [7], neurons with small saliency (sensitivity) are removed, resulting in a sparse computational graph [8]; neurons with small saliency are those whose removal minimally affects the model output/loss function.…”
Section: Related Work
confidence: 99%
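To illustrate the saliency-based weight pruning described above, the sketch below scores each output neuron of a linear layer with a first-order Taylor proxy, |weight × gradient|, and zeroes out the least salient rows. The Taylor criterion, the layer, the random data, and the 20% pruning fraction are illustrative assumptions; the cited works may use different saliency measures.

```python
# Sketch of saliency-based neuron removal: prune the output neurons whose
# removal is estimated (to first order) to affect the loss the least.
import torch
import torch.nn as nn

def prune_low_saliency_neurons(layer: nn.Linear, x, y, loss_fn, fraction=0.2):
    layer.zero_grad()
    loss_fn(layer(x), y).backward()                                   # gradients w.r.t. weights
    saliency = (layer.weight * layer.weight.grad).abs().sum(dim=1)    # one score per output neuron
    n_prune = int(fraction * saliency.numel())
    idx = torch.argsort(saliency)[:n_prune]                           # least salient neurons
    with torch.no_grad():
        layer.weight[idx] = 0.0
        if layer.bias is not None:
            layer.bias[idx] = 0.0
    return idx

layer = nn.Linear(32, 10)
x, y = torch.randn(64, 32), torch.randint(0, 10, (64,))
removed = prune_low_saliency_neurons(layer, x, y, nn.CrossEntropyLoss())
print("removed neurons:", removed.tolist())
```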