2021 IEEE International Conference on Image Processing (ICIP)
DOI: 10.1109/icip42928.2021.9506708

On the Role of Structured Pruning for Neural Network Compression

Abstract: This work explores the benefits of structured parameter pruning in the framework of the MPEG standardization efforts for neural network compression. First, less relevant parameters are pruned from the network; then, the remaining parameters are quantized; finally, the quantized parameters are entropy coded. We consider an unstructured pruning strategy that maximizes the number of pruned parameters at the price of randomly sparse tensors, and a structured strategy that prunes fewer parameters yet yields regularly spars…
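The three-stage pipeline the abstract describes (prune, quantize, entropy code) can be illustrated in a few lines. The following is a minimal NumPy sketch, not the MPEG (NNR) reference pipeline; the sparsity target and quantization step below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(32, 64)).astype(np.float32)  # toy weight tensor

# 1) Pruning: zero out the least relevant (smallest-magnitude) parameters.
threshold = np.quantile(np.abs(weights), 0.5)  # hypothetical 50% sparsity
pruned = np.where(np.abs(weights) < threshold, 0.0, weights)

# 2) Quantization: map the surviving parameters to a small set of levels.
step = 0.1  # hypothetical uniform quantization step
quantized = np.round(pruned / step).astype(np.int8)

# 3) Entropy coding: the skewed symbol statistics (many zeros) compress
# well; here we only estimate the rate via the empirical entropy.
_, counts = np.unique(quantized, return_counts=True)
p = counts / counts.sum()
entropy = -(p * np.log2(p)).sum()
print(f"estimated rate: {entropy:.2f} bits/parameter")
```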

Cited by 6 publications (4 citation statements); references 8 publications.

Citation statements:
“…Pruning In recent years, the focus of NN pruning shifted from unstructured (parameters are removed independently) to structured (entire neurons are zeroed out). A recent work [18] even showed that structured procedures greatly benefit end-to-end compression and deployment on embedded devices. Following this trend, for our experiments we used the structured pruning strategy described in [19].…”
Section: Methods
confidence: 99%
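The unstructured/structured distinction in this excerpt maps directly onto PyTorch's pruning utilities. A minimal sketch using torch.nn.utils.prune (a generic stand-in, not the specific strategy of [19]; layer sizes and pruning amounts are illustrative):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Unstructured pruning: individual weights are zeroed by magnitude,
# maximizing the pruned count but leaving a randomly sparse tensor.
layer_u = nn.Linear(in_features=64, out_features=32)
prune.l1_unstructured(layer_u, name="weight", amount=0.5)
mask_u = layer_u.weight_mask.clone()
prune.remove(layer_u, "weight")

# Structured pruning: whole rows (output neurons) are zeroed by their
# L2 norm, pruning fewer parameters but yielding regular sparsity.
layer_s = nn.Linear(in_features=64, out_features=32)
prune.ln_structured(layer_s, name="weight", amount=0.5, n=2, dim=0)
mask_s = layer_s.weight_mask.clone()
prune.remove(layer_s, "weight")

print("unstructured zeros:", int((mask_u == 0).sum()))
print("structured zeroed rows:", int((mask_s.sum(dim=1) == 0).sum()))
```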
“…As an effect, they minimize the cardinality of some i-th intermediate layer's output, ∥x_i∥_0. Bragagnolo et al. [20] showed that structured sparsity, despite removing significantly fewer parameters from the model, yields a lower memory footprint and inference time. When pruning a network in a structured way, a simplification step which practically reduces the rank of the matrices is possible; on the other hand, encoding unstructured sparse matrices leads to representation overheads [10].…”
Section: Effect of Pruned Backbones to Capsule Layers
confidence: 99%
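The storage argument can be made concrete: a structured-pruned matrix can be simplified to a smaller dense one, while the same number of scattered zeros must be encoded with explicit indices. A minimal sketch, with illustrative sizes and sparsity:

```python
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512)).astype(np.float32)

# Structured: half of the output neurons (rows) are zeroed, so the
# layer can be simplified to a smaller dense matrix (and the next
# layer's input slimmed accordingly) with no index overhead.
W_struct = W.copy()
W_struct[::2, :] = 0.0
kept_rows = np.flatnonzero(np.abs(W_struct).sum(axis=1) > 0)
struct_bytes = W_struct[kept_rows].nbytes  # 128 x 512 dense matrix

# Unstructured: the same fraction of zeros, scattered randomly. A CSR
# encoding must store column indices and row pointers on top of the
# surviving values: the representation overhead noted in [10].
W_unstruct = W * (rng.random(W.shape) > 0.5)
sp = csr_matrix(W_unstruct)
unstruct_bytes = sp.data.nbytes + sp.indices.nbytes + sp.indptr.nbytes

print(struct_bytes, unstruct_bytes)  # structured is markedly smaller
```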
“…Can they be determined earlier, during a normal vanilla training? […] deployment time and making inference more efficient [7, 8, 9, 10, 11]. A recent work, the lottery ticket hypothesis [12], suggests that the fate of a parameter, namely whether it is useful for training (a winner at the lottery of initialization) or whether it can be removed from the architecture, is decided already at the initialization step.…”
Section: Vanilla Training
confidence: 99%
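A compressed sketch of the lottery-ticket idea cited as [12]: derive a mask from trained magnitudes, then apply it to the initial weights, on the hypothesis that those parameters were already the useful ones at initialization. The model, data, and keep-ratio below are toy stand-ins:

```python
import copy
import torch
import torch.nn as nn

model = nn.Linear(20, 2)
init_state = copy.deepcopy(model.state_dict())  # remember the initialization

# Schematic training phase on random data.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(100):
    x = torch.randn(32, 20)
    y = torch.randint(0, 2, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The mask of "winning" parameters is derived from trained magnitudes...
threshold = model.weight.abs().flatten().quantile(0.8)  # keep top 20%
mask = (model.weight.abs() >= threshold).float()

# ...but applied to the *initial* weights: the winning ticket.
model.load_state_dict(init_state)
with torch.no_grad():
    model.weight.mul_(mask)
```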
“…return M; end procedure. Here u_i is some generic update term. In principle, the parameters in W are not in the model anymore, and therefore they should not be included in the computation; however, we still need to encode that they are missing, producing an overhead, as they are removed in an unstructured way [9].…”
Section: The Lottery of the Initialization
confidence: 99%
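The masked update this excerpt alludes to can be written as w ← w − m ⊙ u, with u the generic update term: pruned entries receive no updates but still occupy the tensor, so an encoder must signal their positions. A minimal sketch, with an illustrative random mask and update term:

```python
import torch

w = torch.randn(8, 8)                 # parameter tensor W
m = (torch.rand(8, 8) > 0.5).float()  # 1 = kept, 0 = pruned (unstructured)

for _ in range(10):
    u = 0.01 * torch.randn_like(w)    # generic update term u_i
    w -= m * u                        # pruned entries stay frozen at zero

# The pruned parameters are out of the computation, but the tensor keeps
# its full shape: an encoder must still signal which entries are missing,
# which is the unstructured-pruning overhead discussed in [9].
print(w.numel(), "entries stored,", int((m == 0).sum()), "frozen at zero")
```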