Provable Filter Pruning for Efficient Neural Networks

Liebenwein, Lucas; Baykal, Cenk; Lang, Harry G.; Feldman, Dan; Rus, Daniela

doi:10.48550/arxiv.1911.07412

Cited by 11 publications

(16 citation statements)

References 31 publications


“…To validate the effectiveness of the proposed method on large-scale image classification datasets, we further apply GFBS on the ImageNet dataset and compare with more methods including [33,35,40]. We prune ResNet-18 and ResNet-50 and provide the results in Table 5.…”

Section: Results on ImageNet (mentioning)

confidence: 99%

Exploring Gradient Flow Based Saliency for DNN Model Compression

Liu, Li, Chen et al. 2021

Proceedings of the 29th ACM International Conference on Multimedia


Model pruning aims to reduce the deep neural network (DNN) model size or computational overhead. Traditional model pruning methods such as ℓ1 pruning, which evaluates the channel significance for a DNN, pay too much attention to the local analysis of each channel and make use of the magnitude of the entire feature while ignoring its relevance to the batch normalization (BN) and ReLU layers after each convolutional operation. To overcome these problems, we propose a new model pruning method from a new perspective of gradient flow in this paper. Specifically, we first theoretically analyze the channel's influence based on Taylor expansion by integrating the effects of the BN layer and the ReLU activation function. Then, the incorporation of the first-order Taylor polynomial of the scaling parameter and the shifting parameter in the BN layer is suggested to effectively indicate the significance of a channel in a DNN. Comprehensive experiments on both image classification and image denoising tasks demonstrate the superiority of the proposed novel theory and scheme. Code is available at https://github.com/CityU-AIM-Group/GFBS. CCS CONCEPTS: • Computing methodologies → Machine learning; Model compression; • Computer systems organization → Neural networks.
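As a rough illustration of the first-order Taylor criterion described in this abstract, the sketch below scores each channel by the magnitude of γ·∂L/∂γ + β·∂L/∂β, where γ and β are the BN scaling and shifting parameters. The abstract does not give the exact scoring rule, so this formula, the function names `bn_taylor_saliency` and `channels_to_prune`, and the toy numbers are all assumptions made for illustration, not the GFBS implementation.

```python
import numpy as np

def bn_taylor_saliency(gamma, beta, grad_gamma, grad_beta):
    """Assumed first-order Taylor score per channel.

    Approximates the loss change from zeroing a channel's BN scale (gamma)
    and shift (beta) by |gamma * dL/dgamma + beta * dL/dbeta|.
    """
    gamma = np.asarray(gamma, dtype=float)
    beta = np.asarray(beta, dtype=float)
    grad_gamma = np.asarray(grad_gamma, dtype=float)
    grad_beta = np.asarray(grad_beta, dtype=float)
    return np.abs(gamma * grad_gamma + beta * grad_beta)

def channels_to_prune(saliency, prune_ratio):
    """Return indices of the lowest-scoring channels (hypothetical helper)."""
    k = int(len(saliency) * prune_ratio)
    return np.argsort(saliency, kind="stable")[:k]

# Toy BN parameters and gradients for a 4-channel layer (made-up values).
gamma = np.array([1.0, 0.5, 2.0, 0.1])
beta = np.array([0.0, 0.2, -1.0, 0.0])
grad_gamma = np.array([0.3, -0.4, 0.1, 0.5])
grad_beta = np.array([0.1, 0.5, 0.0, -0.2])

saliency = bn_taylor_saliency(gamma, beta, grad_gamma, grad_beta)
pruned = channels_to_prune(saliency, prune_ratio=0.5)
print("saliency:", saliency)
print("channels selected for pruning:", pruned)
```

In a real pipeline the gradients would typically be accumulated over several training batches before ranking; that step is omitted here.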


“…Discussion of other model compression techniques. We implemented our approach using two types of model compression techniques: (1) data-independent techniques: filter pruning [30] and Filter Pruning via Geometric Median (FPGM) [14] that directly calculate the filters' importance and remove unimportant ones; (2) data-dependent techniques: TaylorFOWeight [41], High-Rank Feature Map (HRank) [34], and Provable Filter Pruning (PFP) [32]) that calculate the filters' importance based on training samples. By testing ResNet18 and WideResNet as examples, Figure 14 shows that all techniques achieves similar accuracies under different available memories.…”

Section: Discussion (mentioning)

confidence: 99%

LegoDNN: Block-grained Scaling of Deep Neural Networks for Mobile Vision

Han, Zhang, Liu et al. 2021

Preprint


Deep neural networks (DNNs) have become ubiquitous techniques in mobile and embedded systems for applications such as image/object recognition and classification. The trend of executing multiple DNNs simultaneously exacerbates the existing limitations of meeting stringent latency/accuracy requirements on resource-constrained mobile devices. The prior art sheds light on exploring the accuracy-resource tradeoff by scaling the model sizes in accordance with resource dynamics. However, such model scaling approaches face two imminent challenges: (i) large space exploration of model sizes, and (ii) prohibitively long training time for different model combinations. In this paper, we present LegoDNN, a lightweight, block-grained scaling solution for running multi-DNN workloads in mobile vision systems. LegoDNN guarantees short model training times by only extracting and training a small number of common blocks (e.g. 5 in VGG and 8 in ResNet) in a DNN. At run-time, LegoDNN optimally combines the descendant models of these blocks to maximize accuracy under specific resource and latency constraints, while reducing switching overhead via smart block-level scaling of the DNN. We implement LegoDNN in TensorFlow Lite and extensively evaluate it against state-of-the-art techniques (FLOP scaling, knowledge distillation and model compression) using a set of 12 popular DNN models. Evaluation results show that LegoDNN provides 1,296x to 279,936x more options in model sizes without increasing training time, thus achieving as much as 31.74% improvement in inference accuracy and 71.07% reduction in scaling energy consumption. CCS CONCEPTS: • Human-centered computing → Ubiquitous and mobile computing; • Computing methodologies → Neural networks.


“…Sampling-based Pruning Recently, a number of works (Baykal et al, 2019a; Liebenwein et al, 2019; Baykal et al, 2019b; Mussay et al, 2020), proposed to procedures for pruning networks based on variants of (iterative) random sampling according to certain sensitivity score. These methods can provide concentration bounds on the difference of output between the pruned networks and the full networks, which may yield a bound of…”

Section: Related Work (mentioning)

confidence: 99%

Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection

Ye, Gong, Nie et al. 2020

Preprint


Recent empirical works show that large deep neural networks are often highly redundant and one can find much smaller subnetworks without a significant drop of accuracy. However, most existing methods of network pruning are empirical and heuristic, leaving it open whether good subnetworks provably exist, how to find them efficiently, and if network pruning can be provably better than direct training using gradient descent. We answer these problems positively by proposing a simple greedy selection approach for finding good subnetworks, which starts from an empty network and greedily adds important neurons from the large network. This differs from the existing methods based on backward elimination, which remove redundant neurons from the large network. Theoretically, applying our greedy selection strategy on sufficiently large pre-trained networks guarantees to find small subnetworks with lower loss than networks directly trained with gradient descent. Practically, we improve prior arts of network pruning on learning compact neural architectures on ImageNet, including ResNet, MobilenetV2/V3, and ProxylessNet. Our theory and empirical results on MobileNet suggest that we should fine-tune the pruned subnetworks to leverage the information from the large model, instead of re-training from new random initialization as suggested in Liu et al. (2019b).
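The greedy forward selection idea in this abstract (start from an empty network, repeatedly add the neuron that helps most) can be sketched on a toy problem. The setup below is an assumption made for illustration: each neuron is represented only by its precomputed output on the data, the subnetwork output is the average of the selected neurons' outputs, the loss is mean squared error, and a neuron may be picked more than once. The function name `greedy_forward_selection` and the toy arrays are hypothetical and not taken from the paper.

```python
import numpy as np

def greedy_forward_selection(neuron_outputs, targets, n_steps):
    """Greedily build a subnetwork from an empty set (toy sketch).

    neuron_outputs: array of shape (n_neurons, n_samples); row i is the
        contribution of neuron i on each sample.
    targets: array of shape (n_samples,).
    n_steps: number of selection steps (must be >= 1).
    Returns the list of selected neuron indices and the final output.
    """
    neuron_outputs = np.asarray(neuron_outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    selected = []
    running_sum = np.zeros_like(targets)
    for step in range(1, n_steps + 1):
        # Output of the subnetwork if each candidate neuron were added next.
        candidates = (running_sum[None, :] + neuron_outputs) / step
        losses = np.mean((candidates - targets[None, :]) ** 2, axis=1)
        best = int(np.argmin(losses))  # ties resolve to the lowest index
        selected.append(best)
        running_sum += neuron_outputs[best]
    return selected, running_sum / n_steps

# Toy data: four neurons evaluated on two samples (made-up values).
neuron_outputs = np.array([[1.0, 1.0],
                           [3.0, 3.0],
                           [0.0, 4.0],
                           [4.0, 0.0]])
targets = np.array([2.0, 2.0])

selected, output = greedy_forward_selection(neuron_outputs, targets, n_steps=2)
print("selected neurons:", selected)
print("subnetwork output:", output)
```

Backward elimination would instead start from all four neurons and remove them one at a time; the forward variant only ever evaluates the neurons it might add.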
