The computation and memory needed for Convolutional Neural Network (CNN) inference can be reduced by pruning weights from the trained network. Pruning is guided by a pruning saliency, which heuristically approximates the change in the loss function associated with the removal of specific weights. Many pruning signals have been proposed, but the performance of each heuristic depends on the particular trained network. This leaves the data scientist with a difficult choice.We propose a method to compose several primitive pruning saliencies, to exploit the cases where each saliency measure does well. Our experiments show that the composition of saliencies avoids many poor pruning choices identified by individual saliencies. In most cases our method finds better selections than even the best individual pruning saliency.
Pruning unimportant parameters can allow deep neural networks (DNNs) to reduce their heavy computation and memory requirements. A saliency metric estimates which parameters can be safely pruned with little impact on the classification performance of the DNN. Many saliency metrics have been proposed, each within the context of a wider pruning algorithm. The result is that it is difficult to separate the effectiveness of the saliency metric from the wider pruning algorithm that surrounds it. Similar-looking saliency metrics can yield very different results because of apparently minor design choices. We propose a novel taxonomy of saliency metrics based on four mostly-orthogonal principal components. We show that a broad range of metrics from the pruning literature can be grouped according to these components. Our taxonomy serves as a guide to prior work, and allows us to construct new saliency metrics by exploring novel combinations of our taxonomic components. We perform the first in-depth experimental investigation of more than 300 saliency metrics made up of existing techniques and new combinations of components. Our results provide decisive answers to open research questions. In particular, we demonstrate the importance of reduction and scaling when pruning groups of weights. We also propose a novel scaling method based on the number of weights transitively removed. We find that some of our constructed metrics can outperform the best existing state-of-the-art metrics for convolutional neural network channel pruning. We find further that our novel scaling method improves existing saliency metrics.INDEX TERMS machine learning, convolution neural networks, pruning, saliency metric, model compression VOLUME x, 20xx
Channel pruning is used to reduce the number of weights in a Convolutional Neural Network (CNN). Channel pruning removes slices of the weight tensor so that the convolution layer remains dense. The removal of these weight slices from a single layer causes mismatching number of feature maps between layers of the network. A simple solution is to force the number of feature map between layers to match through the removal of weight slices from subsequent layers. This additional constraint becomes more apparent in DNNs with branches where multiple channels need to be pruned together to keep the network dense. Popular pruning saliency metrics do not factor in the structural dependencies that arise in DNNs with branches. We propose Domino metrics (built on existing channel saliency metrics) to reflect these structural constraints. We test Domino saliency metrics against the baseline channel saliency metrics on multiple networks with branches. Domino saliency metrics improved pruning rates in most tested networks and up to 25% in AlexNet on CIFAR-10.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.