2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
DOI: 10.1109/iccvw.2019.00360

VACL: Variance-Aware Cross-Layer Regularization for Pruning Deep Residual Networks

Abstract: Improving weight sparsity is a common strategy for producing lightweight deep neural networks. However, pruning models with residual learning is more challenging. In this paper, we introduce Variance-Aware Cross-Layer (VACL), a novel approach to address this problem. VACL consists of two parts: a Cross-Layer grouping and a Variance-Aware regularization. In Cross-Layer grouping, the i-th filters of layers connected by skip-connections are grouped into one regularization group. Then, the Variance-Aware regulariz…
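The abstract describes the two components only at a high level, so the following PyTorch sketch is a rough illustration rather than the paper's exact method: it forms cross-layer groups from the i-th filters of skip-connected layers and applies a group-sparsity penalty with an assumed variance-based rescaling. The function name `vacl_style_penalty` and the precise scaling term are assumptions.

```python
import torch

def vacl_style_penalty(skip_connected_groups, lam=1e-4):
    """Cross-layer group-sparsity penalty: group i collects the i-th output
    filter of every conv layer whose outputs are tied by skip connections,
    so those layers must share the same number of output filters.
    The variance-based rescaling below is illustrative only; the paper's
    exact Variance-Aware term is not reproduced here."""
    penalty = 0.0
    for layers in skip_connected_groups:            # e.g. the last convs of one ResNet stage
        # flatten each layer's filters and concatenate them filter-wise
        flat = torch.cat([l.weight.view(l.weight.size(0), -1) for l in layers], dim=1)
        group_norms = flat.norm(p=2, dim=1)         # one L2 norm per cross-layer group
        group_var = flat.var(dim=1) + 1e-8          # assumed per-group variance term
        penalty = penalty + (group_norms / group_var.sqrt()).sum()
    return lam * penalty

# Usage sketch: add `vacl_style_penalty(groups)` to the task loss, where each
# entry of `groups` is a list of nn.Conv2d modules joined by skip connections.
```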

Cited by 11 publications (5 citation statements) · References: 33 publications

Citation statements (ordered by relevance):
“…Lemaire et al [34] use a mixed block connectivity to avoid redundant computation, which can be treated as a subset of our method. Recently, group pruning strategies [21], [22], [31], [35] have been proposed that assign such layers to the same group, so that filters in the same group can be pruned simultaneously. Unfortunately, group pruning requires that all connections of a pruned filter be removed simultaneously, a constraint strong enough to limit pruning performance, especially at high pruning ratios.…”
Section: B. Pruning Structure
confidence: 99%
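To make the constraint mentioned in this statement concrete, here is a small hypothetical PyTorch sketch of group pruning in a residual network: one shared keep-mask removes a filter from every layer tied by the skip connection and the matching input channel from every consumer. The helper name `prune_cross_layer_group` and its interface are illustrative, not taken from the cited works.

```python
import torch

def prune_cross_layer_group(group_convs, downstream_convs, keep_mask):
    """Illustrative group pruning: one boolean keep_mask is shared by every
    conv whose outputs meet at a skip connection, so removing filter i in
    one layer forces removing channel i in all of them and in every consumer."""
    idx = torch.nonzero(keep_mask, as_tuple=False).squeeze(1)
    for conv in group_convs:                    # output filters pruned together
        conv.weight.data = conv.weight.data[idx]
        if conv.bias is not None:
            conv.bias.data = conv.bias.data[idx]
        conv.out_channels = idx.numel()
    for conv in downstream_convs:               # consumers lose the matching input channels
        conv.weight.data = conv.weight.data[:, idx]
        conv.in_channels = idx.numel()
    # (BatchNorm layers following the pruned convs would need the same index set.)
```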
“…The biggest difference between our pruned residual neural network and existing methods is that we approach pruning from the perspective of gating feature maps: a gate function is applied to each channel while skip connections are always retained. One simple strategy is to prune each channel independently, without any constraint across channels or layers; since the channel search space is larger than in strategies such as group pruning [21], [22] or skipping [19], [20], a better trade-off between compression ratio and performance should be achievable. However, too much freedom leads to irregular distributions of pruned channels across residual blocks, and in our experiments the advantage of this strategy over group pruning or skipping was not as significant as expected, especially for more complex models and tasks.…”
Section: B. Fine-Grained Pruning for Residual Neural Networks
confidence: 99%
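A minimal sketch of the channel-gating idea described in this statement, assuming a per-channel multiplicative gate on the residual branch; the class `GatedResidualBlock` and its details are illustrative, not the citing paper's actual implementation.

```python
import torch
import torch.nn as nn

class GatedResidualBlock(nn.Module):
    """Toy residual block with one learnable gate per channel (illustrative).
    Channels whose gate reaches zero can be pruned independently, while the
    skip connection itself is always kept."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(channels)
        self.gate = nn.Parameter(torch.ones(channels))   # one gate per channel

    def forward(self, x):
        y = torch.relu(self.bn(self.conv(x)))
        y = y * self.gate.view(1, -1, 1, 1)              # gate the residual branch only
        return x + y                                     # skip connection retained
```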
“…Since MobileNet uses group convolutional layers to speed up inference, we treat each group convolutional layer together with its preceding connected convolutional layer as coupled cross-layers (Gao et al., 2019), so that the input and output channel numbers of the group convolution remain the same. All 27 convolutional layers can thus be divided into 14 coupled layers.…”
Section: B. Efficacy of Neuron Grouping on MobileNet
confidence: 99%
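The coupling constraint can be illustrated with a short hypothetical sketch for a depthwise (per-channel group) convolution: the preceding conv and the group conv must be pruned with the same channel indices so that the group conv's input and output widths stay equal. The helper `prune_coupled_pair` and the assumption that `keep_idx` is a LongTensor of retained channel indices are illustrative.

```python
import torch.nn as nn

def prune_coupled_pair(prev_conv: nn.Conv2d, depthwise_conv: nn.Conv2d, keep_idx):
    """Hypothetical coupled-cross-layer pruning for a MobileNet-style
    depthwise (groups == channels) convolution: the preceding conv's output
    filters and the depthwise conv's channels are pruned with the same index
    set, keeping the depthwise conv's input and output channel counts equal."""
    prev_conv.weight.data = prev_conv.weight.data[keep_idx]            # drop output filters
    if prev_conv.bias is not None:
        prev_conv.bias.data = prev_conv.bias.data[keep_idx]
    prev_conv.out_channels = keep_idx.numel()

    depthwise_conv.weight.data = depthwise_conv.weight.data[keep_idx]  # one kernel per channel
    depthwise_conv.in_channels = keep_idx.numel()
    depthwise_conv.out_channels = keep_idx.numel()
    depthwise_conv.groups = keep_idx.numel()
```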
“…Prune-while-training methods sit in the middle, trading off training efficiency against final accuracy. The literature falls into two streams: a) regularization-based methods that encourage sparsity during training [3,13,28], and b) saliency-based sub-ticket selection methods that discard redundancy [2,14,17]. Our work belongs to the latter, given its ability to quickly enforce a pruning ratio and its ease of control during training.…”
Section: Related Work
confidence: 99%
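As a generic example of stream (b), the sketch below enforces a target pruning ratio with a simple magnitude-saliency mask; it is an assumed, simplified illustration and not the formulation of [2], [14], or [17].

```python
import torch

def magnitude_saliency_mask(weight, prune_ratio):
    """Generic magnitude-saliency mask that enforces a given pruning ratio:
    keep the (1 - prune_ratio) fraction of weights with the largest magnitude."""
    k = int(prune_ratio * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    threshold = torch.kthvalue(weight.abs().flatten(), k).values
    return (weight.abs() > threshold).float()

# Re-applied each training step, e.g.:
# w.data.mul_(magnitude_saliency_mask(w.data, prune_ratio=0.5))
```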