2021
DOI: 10.48550/arxiv.2102.00554
Preprint

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Abstract: The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well as, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever-growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an …

Cited by 43 publications (60 citation statements)
References 126 publications (180 reference statements)
“…We assume all sparse approaches use the coordinate (COO) format to store the sparse gradient, which consumes 2𝑘 storage, i.e., 𝑘 values plus 𝑘 indexes. There are other sparse formats (see [22] for an overview), but format selection for a given sparsity is not the topic of this work. To model the communication overhead, we assume bidirectional and direct point-to-point communication between the compute nodes, and use the classic latency-bandwidth cost model.…”
Section: Algorithms
confidence: 99%
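To make the storage and communication figures in the statement above concrete, here is a minimal sketch (my own illustration in NumPy, not code from the citing paper or the survey) of a COO-encoded top-k sparsified gradient, which costs 2k storage (k values plus k indices), together with the classic latency-bandwidth (alpha-beta) cost model for sending it point-to-point. The function names and the toy alpha/beta values are assumptions.

```python
# Illustrative sketch: top-k gradient sparsification stored in coordinate
# (COO) format, i.e. k values plus k indices = 2k stored entries.
import numpy as np

def topk_coo(grad: np.ndarray, k: int):
    """Keep the k largest-magnitude entries of a flattened gradient.
    Returns (indices, values): the COO representation (2k storage)."""
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def alpha_beta_cost(alpha: float, beta: float, k: int, bytes_per_entry: int = 8):
    """Latency-bandwidth model for one point-to-point message carrying
    a COO-encoded gradient: alpha + beta * message_size."""
    message_bytes = 2 * k * bytes_per_entry  # k indices + k values
    return alpha + beta * message_bytes

if __name__ == "__main__":
    g = np.random.randn(1_000_000)
    idx, vals = topk_coo(g, k=1_000)
    print(len(idx) + len(vals))                              # 2k = 2000 stored entries
    print(alpha_beta_cost(alpha=1e-6, beta=1e-9, k=1_000))   # toy latency/bandwidth values
```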
“…Only the nonzero values of the distributed gradients are accumulated across all processes. See [22] for an overview of gradient and other sparsification approaches in deep learning.…”
Section: Introduction
confidence: 99%
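The statement above describes accumulating only the nonzero entries of distributed sparse gradients. A minimal dict-based sketch of that accumulation step follows; it is an assumed toy illustration, not an actual MPI/NCCL sparse allreduce.

```python
# Illustrative sketch: summing per-worker sparse gradients while touching
# only their nonzero entries.
from collections import defaultdict

def accumulate_sparse(worker_grads):
    """worker_grads: list of {index: value} dicts, one per process.
    Returns the elementwise sum over all processes."""
    total = defaultdict(float)
    for grad in worker_grads:
        for idx, val in grad.items():
            total[idx] += val
    return dict(total)

# Toy example: three workers with overlapping sets of nonzero indices.
workers = [{0: 0.5, 7: -1.2}, {7: 0.3, 42: 2.0}, {0: -0.1}]
print(accumulate_sparse(workers))  # approximately {0: 0.4, 7: -0.9, 42: 2.0}
```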
“…Recently, there has been significant research interest in pruning techniques, and hundreds of different sparsification approaches have been proposed; please see the recent surveys of [15] and [25] for a comprehensive exposition. We categorize existing pruning methods as follows.…”
Section: Sparsification Techniques
confidence: 99%
“…The increasing computational and storage costs of deep learning models have led to significant academic and industrial interest in model compression, which is roughly the task of obtaining smaller-footprint models matching the accuracy of larger baseline models. Model compression is a rapidly-developing area, and several generic approaches have been investigated, among which pruning and quantization are among the most popular [16,25].…”
Section: Introduction
confidence: 99%
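Since the statement above names pruning and quantization as the most popular compression approaches, here is a short sketch of what the two look like in their simplest forms (assumed toy NumPy implementations for illustration, not the methods covered by the cited surveys).

```python
# Illustrative sketch: magnitude pruning and uniform quantization of a weight matrix.
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def uniform_quantize(weights: np.ndarray, bits: int = 8) -> np.ndarray:
    """Round weights onto a uniform grid with 2**bits levels over their range."""
    lo, hi = weights.min(), weights.max()
    scale = max((hi - lo) / (2**bits - 1), 1e-12)  # guard against constant inputs
    return np.round((weights - lo) / scale) * scale + lo

w = np.random.randn(4, 4)
print(magnitude_prune(w, sparsity=0.5))  # half the entries set to zero
print(uniform_quantize(w, bits=4))       # 16 representable levels
```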