2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC)
DOI: 10.1109/hipc.2019.00035
Efficient Sparse Neural Networks Using Regularized Multi Block Sparsity Pattern on a GPU

Cited by 4 publications (3 citation statements)
References 13 publications
“…The authors in [63] proposed a generic sparsity pattern termed the Regularized Multi Block (RMB) sparsity pattern, an efficient storage format (CRMB), and a fast GPU algorithm for processing RMBMM (SDMM with the multiplicand having the RMB sparsity pattern). Figure 9 shows the CRMB storage format for storing an RMB sparse matrix.…”
Section: Architecture/Platform/Framework and Strategy (mentioning)
confidence: 99%
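
The CRMB layout itself is defined in [63] and is not reproduced here. As a rough illustration of how a block-compressed sparse format can support an SDMM-style multiply by storing only the nonzero blocks together with per-block-row column indices, the sketch below may help; the class `BlockSparseMatrix`, the function `sdmm`, and the layout details are illustrative assumptions, not the CRMB format or the RMBMM GPU kernel from the paper.

```python
# Illustrative sketch only: a generic block-compressed sparse layout and a
# dense x block-sparse multiply (SDMM-style). This is NOT the CRMB format or
# the RMBMM GPU kernel from [63]; names and layout are assumptions, and matrix
# dimensions are assumed divisible by the block size.
import numpy as np

class BlockSparseMatrix:
    """Sparse matrix stored as nonzero b x b blocks, CSR-like over blocks."""
    def __init__(self, dense, block=4):
        self.block = block
        self.shape = dense.shape
        nbr, nbc = dense.shape[0] // block, dense.shape[1] // block
        self.row_ptr = [0]          # start of each block-row's stored blocks
        self.col_idx = []           # block-column index of each stored block
        self.blocks = []            # the b x b nonzero blocks themselves
        for br in range(nbr):
            for bc in range(nbc):
                blk = dense[br*block:(br+1)*block, bc*block:(bc+1)*block]
                if np.any(blk):     # keep only nonzero blocks
                    self.col_idx.append(bc)
                    self.blocks.append(blk.copy())
            self.row_ptr.append(len(self.blocks))

def sdmm(dense_lhs, sparse_rhs):
    """Dense (m x k) times block-sparse (k x n): touch only stored blocks."""
    b = sparse_rhs.block
    out = np.zeros((dense_lhs.shape[0], sparse_rhs.shape[1]))
    for br in range(len(sparse_rhs.row_ptr) - 1):
        lhs_cols = dense_lhs[:, br*b:(br+1)*b]
        for idx in range(sparse_rhs.row_ptr[br], sparse_rhs.row_ptr[br+1]):
            bc = sparse_rhs.col_idx[idx]
            out[:, bc*b:(bc+1)*b] += lhs_cols @ sparse_rhs.blocks[idx]
    return out
```

Because every stored block is a small dense tile, each inner product maps onto a dense tile multiply, which is the usual motivation for block-structured sparsity patterns on GPUs.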
“…The idea of pruning was revived by Han et al. 3,4 by simply pruning weights based on their magnitude. To improve run-time performance on dense AI hardware, structured pruning methods [5]-[12] have been proposed with various structured sparsity patterns such as filter, channel, block, and multi-block.…”
Section: Related Work (mentioning)
confidence: 99%
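
A minimal sketch of magnitude-based element pruning in the spirit of Han et al.; the percentile threshold rule and the name `magnitude_prune` are illustrative assumptions, not their exact training and retraining pipeline.

```python
# Minimal sketch of magnitude-based element pruning: weights whose absolute
# value falls below the chosen percentile are zeroed. Illustrative only.
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude `sparsity` fraction of `weights`."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask   # keep the mask so pruned entries stay zero

# Example: prune 90% of a random fully connected layer's weights.
w = np.random.randn(256, 512)
w_pruned, mask = magnitude_prune(w, sparsity=0.9)
```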
“…But the main issue with element pruning is that the generated sparse neural networks have irregular compute and memory access patterns due to the unstructured sparsity pattern, and thus cannot be efficiently mapped onto dense AI hardware. Structured pruning methods 5-12 have been proposed to improve the run-time performance of sparse neural networks. Unlike element pruning, where parameters are removed individually, in structured pruning parameters are first grouped into structural units such as filters, channels, blocks, or multi-blocks, and are then removed at the unit level based on the strength of the unit.…”
Section: Introduction (mentioning)
confidence: 99%
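
A minimal sketch of the unit-level idea described above, assuming fixed-size blocks scored by their L2 norm; the block size, the scoring rule, and the name `block_prune` are illustrative assumptions, not the RMB pruning procedure from the cited paper.

```python
# Minimal sketch of structured (block) pruning: weights are grouped into
# fixed-size blocks, each block is scored by a strength measure (L2 norm
# here), and the weakest blocks are zeroed as whole units. Illustrative only.
import numpy as np

def block_prune(weights, block=(4, 4), sparsity=0.75):
    rows, cols = weights.shape
    br, bc = block
    out = weights.copy()
    # Score every block by its L2 norm (the "strength" of the unit).
    scores = []
    for r in range(0, rows, br):
        for c in range(0, cols, bc):
            scores.append((np.linalg.norm(out[r:r+br, c:c+bc]), r, c))
    scores.sort(key=lambda s: s[0])
    # Remove the weakest `sparsity` fraction of blocks as whole units.
    for _, r, c in scores[:int(len(scores) * sparsity)]:
        out[r:r+br, c:c+bc] = 0.0
    return out

# Example: keep only the strongest 25% of 4x4 blocks in a layer.
w = np.random.randn(64, 64)
w_block_sparse = block_prune(w, block=(4, 4), sparsity=0.75)
```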