2023
DOI: 10.48550/arxiv.2301.12187
Preprint

Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming

Abstract: Recent works on neural network pruning advocate that reducing the depth of the network is more effective in reducing run-time memory usage and accelerating inference latency than reducing the width of the network through channel pruning. In this regard, some recent works propose depth compression algorithms that merge convolution layers. However, the existing algorithms have a constricted search space and rely on human-engineered heuristics. In this paper, we propose a novel depth compression algorithm which t…

Cited by 0 publications
References 14 publications (21 reference statements)