Proceedings of the 49th Annual International Symposium on Computer Architecture (ISCA 2022)
DOI: 10.1145/3470496.3527419

Cascading structured pruning

Abstract: Performance and efficiency of running modern Deep Neural Networks (DNNs) are heavily bounded by data movement. To mitigate the data movement bottlenecks, recent DNN inference accelerator designs widely adopt aggressive compression techniques and sparse-skipping mechanisms. These mechanisms avoid transferring or computing with zero-valued weights or activations to save time and energy. However, such sparse-skipping logic involves large input buffers and irregular data access patterns, thus precluding many energ…
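To make the contrast with unstructured sparse-skipping concrete, below is a minimal NumPy sketch of structured (neuron/channel) pruning across two fully connected layers. It is an illustrative example only, not the cascading scheme proposed in the paper; the function name, the L2-norm saliency criterion, and the keep ratio are all assumptions introduced for the sketch.

```python
import numpy as np

def prune_output_neurons(W1, b1, W2, keep_ratio=0.5):
    """Structurally prune whole output neurons of layer 1 (rows of W1).

    Because entire neurons are removed, the matching input columns of the
    next layer's weight matrix W2 can be dropped as well, so the remaining
    matrices stay dense and need no sparse-skipping buffers or logic.
    Sketch only: the saliency criterion and names are hypothetical.
    """
    # Rank layer-1 neurons by the L2 norm of their weight rows.
    saliency = np.linalg.norm(W1, axis=1)
    n_keep = max(1, int(keep_ratio * W1.shape[0]))
    keep = np.sort(np.argsort(saliency)[-n_keep:])

    # Drop pruned rows of W1/b1 and the corresponding columns of W2.
    return W1[keep], b1[keep], W2[:, keep]

# Toy usage: a 2-layer MLP with weight shapes (out1, in) and (out2, out1).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 16)), rng.normal(size=8)
W2 = rng.normal(size=(4, 8))
x = rng.normal(size=16)

W1p, b1p, W2p = prune_output_neurons(W1, b1, W2, keep_ratio=0.5)
y = W2p @ np.maximum(W1p @ x + b1p, 0.0)  # dense compute on smaller matrices
print(W1p.shape, W2p.shape, y.shape)       # (4, 16) (4, 4) (4,)
```

The point of the sketch is that structured pruning shrinks the matrices themselves, so the forward pass stays a regular dense computation, whereas element-wise (unstructured) sparsity leaves the shapes intact and requires the kind of sparse-skipping hardware the abstract describes.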

Cited by 15 publications (1 citation statement) | References 29 publications
“…Structured pruning reduces computational complexity, simplifies sparse matrix computations, and is easier to use across different deep learning frameworks. Consequently, recent research has been inclined towards employing structured pruning algorithms for model pruning [62][63][64][65][66][67].…”
Section: Analysis and Discussion
confidence: 99%