2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
DOI: 10.1109/iccvw.2019.00245
Efficient Structured Pruning and Architecture Searching for Group Convolution

Abstract: Efficient inference of Convolutional Neural Networks has recently become a thriving topic. When deploying a pre-trained model, it is desirable to achieve the maximal test accuracy under given inference budget constraints. Network pruning is a commonly used technique, but it may produce irregular sparse models that can hardly gain actual speed-up. Group convolution is a promising pruning target due to its regular structure; however, incorporating such structure into the pruning procedure is challenging. It is because s…
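The regularity argument is easiest to see in code. Below is a minimal sketch (assuming PyTorch; the layer sizes are arbitrary choices, not from the paper) of why group convolution is an attractive pruning target: it keeps the dense layer's interface and output shape while deleting all cross-group kernels, so the sparsity is structured and maps onto off-the-shelf grouped-convolution kernels rather than requiring custom sparse ones.

```python
# Minimal sketch contrasting a dense convolution with a group convolution
# of the same input/output shape. Grouping removes all cross-group kernels,
# so the "sparsity" is regular and runs on standard grouped-conv kernels.
import torch
import torch.nn as nn

in_ch, out_ch, k, groups = 64, 64, 3, 4  # arbitrary illustrative sizes

dense = nn.Conv2d(in_ch, out_ch, k, padding=1)
grouped = nn.Conv2d(in_ch, out_ch, k, padding=1, groups=groups)

x = torch.randn(1, in_ch, 32, 32)
assert dense(x).shape == grouped(x).shape  # identical layer interface

n_dense = sum(p.numel() for p in dense.parameters())
n_grouped = sum(p.numel() for p in grouped.parameters())
print(f"dense:   {n_dense} params")    # 64*64*3*3 + 64
print(f"grouped: {n_grouped} params")  # 64*(64//4)*3*3 + 64, ~4x fewer weights
```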

Cited by 9 publications (7 citation statements) · References 33 publications (75 reference statements)
“…Weight pruning selects individual weights to be pruned. Because of the unstructured selection patterns, it requires customized GPU kernels or specialized hardware [17,58]. Filter pruning selects entire filters only.…”
Section: DNN Compression Methods (confidence: 99%)
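The contrast this quote draws can be made concrete. The following sketch (NumPy, with illustrative shapes and a simple magnitude criterion that are not taken from either paper) shows why the two patterns differ in hardware-friendliness: weight pruning scatters zeros through the tensor, while filter pruning leaves a smaller but still dense tensor.

```python
# Sketch of the two selection patterns: unstructured weight pruning zeroes
# individual weights (irregular mask), filter pruning drops whole filters
# (the remaining tensor stays dense and runs on standard kernels).
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4, 3, 3))  # (out_filters, in_channels, kH, kW)

# Unstructured weight pruning: keep the largest 50% of weights anywhere.
thresh = np.quantile(np.abs(W), 0.5)
W_sparse = np.where(np.abs(W) >= thresh, W, 0.0)  # scattered zeros

# Filter pruning: rank filters by L1 norm and keep the top half.
norms = np.abs(W).reshape(W.shape[0], -1).sum(axis=1)
keep = np.argsort(norms)[-4:]   # indices of the 4 strongest filters
W_pruned = W[keep]              # shape (4, 4, 3, 3), still dense

print(W_sparse.size - np.count_nonzero(W_sparse), "weights zeroed (irregular)")
print(W_pruned.shape, "dense tensor after filter pruning")
```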
“…Although this method is flexible, it relies heavily on a learnable binary relationship matrix U, which introduces additional parameters that complicate the training of deep convolutional neural networks. Ruizhe et al. [24] formulated group-based convolution pruning as a channel permutation optimization problem and efficiently solved the channel structural constraints using a heuristic algorithm. However, it was difficult to determine the structural constraints of each convolutional layer.…”
Section: Related Work (confidence: 99%)
“…To overcome this shortcoming, Zhuo et al. [23] proposed a dynamic group convolution (DGC), in which each group is decided dynamically by a tiny auxiliary feature selector. Ruizhe et al. [24] formulated group-based convolution pruning as a channel permutation optimization problem and solved the channel structural constraints efficiently using a heuristic algorithm. However, it was difficult to determine the structural constraints of each convolutional layer.…”
Section: Introduction (confidence: 99%)
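Both quotes describe [24] as casting group-convolution pruning as a channel permutation problem solved with a heuristic. As a rough illustration only (a toy greedy assignment under assumed importance scores, not the paper's actual heuristic), one can reorder filters and channels so that most of the weight importance lands on the block diagonal that a group convolution retains:

```python
# Toy sketch of the permutation idea: assign filters and channels to groups
# so that the retained block-diagonal structure keeps as much importance
# mass as possible. Greedy stand-in, not the heuristic used in [24].
import numpy as np

rng = np.random.default_rng(0)
n_out, n_in, groups = 8, 8, 2
# Per-connection importance, e.g. the L1 norm of each (filter, channel) kernel.
imp = np.abs(rng.normal(size=(n_out, n_in)))

gsize_out, gsize_in = n_out // groups, n_in // groups
free_out, free_in = set(range(n_out)), set(range(n_in))
assign = []
for g in range(groups):
    # Pick the remaining filters with the most mass on remaining channels,
    # then the channels that best match those filters.
    outs = sorted(free_out, key=lambda o: -imp[o, list(free_in)].sum())[:gsize_out]
    ins = sorted(free_in, key=lambda i: -imp[outs, i].sum())[:gsize_in]
    free_out -= set(outs)
    free_in -= set(ins)
    assign.append((outs, ins))

kept = sum(imp[np.ix_(o, i)].sum() for o, i in assign)
print(f"retained {kept / imp.sum():.1%} of importance inside the groups")
```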
“…Conceptually, model compression shares a similar formulation to NAS, i.e., the generalized formulation in Section 2.1 directly applies with either a regularization term for model complexity or a hard constraint for the maximal resource. Therefore, NAS approaches are often easily transferred for model compression [358], [417], including pruning [244], [359], [360], [364], [418], [419], quantization [224], [420], [421], [422], and joint optimization [423], [424], [425]. Sometimes, the searched configuration or connectivity mask…”
Section: Model Compression (confidence: 99%)
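The shared formulation this survey points to can be written as a single search loop covering both the regularized and the hard-constrained cases. The helper names and toy proxies below are hypothetical stand-ins, not taken from the survey:

```python
# Sketch of the generalized objective: pick the candidate architecture that
# maximizes accuracy minus a complexity penalty (lam > 0), optionally subject
# to a hard resource budget. Same loop serves NAS and model compression.
def search(candidates, accuracy, flops, budget=None, lam=0.0):
    """accuracy/flops are assumed evaluation callables; budget caps FLOPs."""
    feasible = [a for a in candidates if budget is None or flops(a) <= budget]
    return max(feasible, key=lambda a: accuracy(a) - lam * flops(a))

# Dummy stand-ins for illustration only.
cands = [{"width": w} for w in (0.25, 0.5, 1.0)]
acc = lambda a: 0.6 + 0.3 * a["width"]      # toy accuracy proxy
cost = lambda a: 100 * a["width"] ** 2      # toy FLOPs proxy
print(search(cands, acc, cost, budget=50))  # best model within the budget
```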