Most state-of-the-art deep neural networks are overparameterized and exhibit a high computational cost. A straightforward approach to this problem is to replace convolutional kernels with their low-rank tensor approximations, and the canonical polyadic (CP) tensor decomposition is one of the best-suited models. However, fitting convolutional tensors with numerical optimization algorithms often encounters diverging components, i.e., rank-one tensors of extremely large norm that nearly cancel each other. Such degeneracy often yields non-interpretable results and numerical instability during fine-tuning of the neural network. This paper is the first study of degeneracy in the tensor decomposition of convolutional kernels. We present a novel method that stabilizes the low-rank approximation of convolutional kernels and ensures efficient compression while preserving the high-quality performance of the neural networks. We evaluate our approach on popular CNN architectures for image classification and show that our method results in much lower accuracy degradation and provides consistent performance.
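As a concrete illustration of the phenomenon (a minimal sketch, not the paper's stabilization method), one can CP-decompose a convolutional kernel with TensorLy and inspect the magnitudes of the rank-one components; a degenerate fit shows up as components with extremely large, nearly canceling weights. The kernel shape and rank below are arbitrary assumptions.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Hypothetical 4-way conv kernel: (out_channels, in_channels, kH, kW).
kernel = np.random.randn(64, 32, 3, 3)

# With normalize_factors=True, each factor column has unit norm and `weights`
# carries the magnitude of each rank-one component.
weights, factors = parafac(tl.tensor(kernel), rank=16, normalize_factors=True)

# Extreme, nearly canceling component magnitudes are a symptom of a
# degenerate (diverging) CP decomposition.
print("component magnitudes:", np.sort(np.abs(weights))[::-1])
```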
This paper proposes a constrained canonical polyadic (CP) tensor decomposition method with low-rank factor matrices. In this way, we allow a CP decomposition of high rank while keeping the number of model parameters small. First, we propose an algorithm that decomposes a tensor into factor matrices of given ranks. Second, we propose an algorithm that automatically determines the ranks of the factor matrices such that the fitting error is bounded by a user-selected constant. The algorithms are verified on the decomposition of a tensor built from the MNIST handwritten-image dataset.
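The paper's own algorithms are not reproduced here; the following is a minimal sketch, assuming TensorLy and NumPy, of the general idea: fit a plain CP decomposition at a high rank R, then compress each factor matrix with a truncated SVD of a user-chosen rank, so the parameter count scales with the small factor ranks rather than with R. This post-hoc truncation is an illustrative stand-in, not the authors' constrained fitting procedure, and the factor ranks below are assumed rather than selected automatically.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

tensor = np.random.randn(28, 28, 1000)   # e.g., a stack of MNIST-like images
R = 50                                   # high CP rank
factor_ranks = [10, 10, 20]              # assumed low ranks for the factors

weights, factors = parafac(tl.tensor(tensor), rank=R)

low_rank_factors = []
for A, r in zip(factors, factor_ranks):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Store A ~ (U_r * s_r) @ Vt_r as two thin matrices, giving
    # r * (I_n + R) parameters per factor instead of I_n * R.
    low_rank_factors.append((U[:, :r] * s[:r], Vt[:r, :]))
```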
Low-rank matrix/tensor decompositions are promising methods for reducing the inference time, computation, and memory consumption of deep neural networks (DNNs). This group of methods decomposes the pre-trained neural network weights through low-rank matrix/tensor decomposition and replaces the original layers with lightweight factorized layers. A main drawback of this technique is that it demands considerable time and effort to select the best ranks of the tensor decomposition for each layer in a DNN. This paper proposes a Proxy-based Automatic tensor Rank Selection method (PARS) that uses Bayesian optimization to find the best combination of ranks for neural network (NN) compression. We observe that the decomposition of weight tensors adversely influences the feature distribution inside the neural network and impairs the predictability of post-compression DNN performance. Based on this finding, a novel proxy metric is proposed to address this issue and to increase the quality of the rank-search procedure. Experimental results show that PARS improves the results of existing decomposition methods on several representative NNs, including ResNet-18, ResNet-56, VGG-16, and AlexNet. We obtain a 3× FLOP reduction with almost no loss of accuracy for ILSVRC-2012 ResNet-18 and a 5.5× FLOP reduction with an accuracy improvement for ILSVRC-2012 VGG-16.
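Below is a hedged sketch of the kind of search loop the abstract describes, not the paper's implementation: Bayesian optimization (here via scikit-optimize's gp_minimize) over per-layer ranks, scoring each candidate with a proxy metric rather than fully fine-tuned accuracy. The layer names, proxy_score, and flop_cost are toy placeholders standing in for PARS's actual proxy metric and FLOP accounting.

```python
from skopt import gp_minimize
from skopt.space import Integer

layers = ["conv1", "conv2", "conv3"]               # assumed layer names
space = [Integer(2, 64, name=f"rank_{l}") for l in layers]

def proxy_score(ranks):
    # Placeholder: PARS would measure post-compression feature quality here;
    # in this toy version, larger ranks simply preserve more of each layer.
    return sum(r / 64 for r in ranks) / len(ranks)

def flop_cost(ranks):
    # Placeholder: FLOPs of factorized layers grow roughly linearly in rank.
    return sum(ranks) / (64 * len(ranks))

def objective(ranks):
    # Lower is better: trade the accuracy proxy against the compute budget.
    return -proxy_score(ranks) + 0.5 * flop_cost(ranks)

result = gp_minimize(objective, space, n_calls=30, random_state=0)
best_ranks = dict(zip(layers, result.x))
print(best_ranks)
```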