Authorea
DOI: 10.22541/au.158145130.06115167

In quest for the thinking machine


Cited by 1 publication (4 citation statements)
References 0 publications
“…In Figure 3 we highlight the potential realized gains with unstructured weight sparsity on specialized hardware for deep learning such as the Cerebras CS-2. This figure was regenerated based on the plot in (Lie, 2021).…”
Section: Unstructured Sparsity on Specialized Hardware Accelerators
Citation type: mentioning (confidence: 99%)
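The gain this statement refers to is easy to quantify in the ideal case: an unstructured-sparse matrix multiply skips the multiply-accumulates for zero-valued weights, so the theoretical speedup at sparsity level s is 1/(1 - s). The sketch below (plain Python; the layer dimensions are hypothetical, chosen only for illustration and not taken from the cited papers) makes that arithmetic concrete:

```python
# Ideal FLOP savings from unstructured weight sparsity in a matmul.
# Dimensions are illustrative, not taken from the cited papers.

def matmul_flops(m: int, k: int, n: int) -> int:
    """FLOPs for a dense (m x k) @ (k x n) matmul (2 ops per MAC)."""
    return 2 * m * k * n

def ideal_speedup(sparsity: float) -> float:
    """Ideal speedup when hardware skips zero weights entirely."""
    return 1.0 / (1.0 - sparsity)

m, k, n = 2048, 12288, 12288  # hypothetical transformer-layer shapes
dense = matmul_flops(m, k, n)
for s in (0.5, 0.75, 0.9):
    print(f"sparsity {s:.0%}: {dense * (1 - s):.3e} FLOPs, "
          f"ideal speedup {ideal_speedup(s):.1f}x")
```

Measured speedups on real hardware, such as the CS-2 numbers the citing paper regenerates, fall below this ideal because of memory traffic and kernel overheads.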
“…In this work, we show how we can leverage weight sparsity to reduce training FLOPs, and then recover the lost representational capacity by shifting to dense weight matrices when fine-tuning on downstream tasks. In addition, while specialized software kernels have been developed to achieve inference acceleration with unstructured sparsity (NeuralMagic, 2021; Elsen et al., 2019; Ashby et al., 2019; Wang, 2021), recent work has shown that we can realize the gains of unstructured weight sparsity on specialized hardware (e.g., Cerebras CS-2 (Lie, 2022; 2021)) when training LLMs. For example, Lie (2021) shows the measured speedup for a matrix multiplication kernel w.r.t. the sparsity level on a single GPT-3 layer (see Appendix C for more details).…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
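The sparse-pretrain-then-dense-finetune recipe this statement describes can be sketched with a fixed binary mask applied to a layer's weights during pre-training, then dropped for fine-tuning. A minimal PyTorch illustration follows; the random mask pattern and the `densify` step are assumptions for demonstration, not the cited papers' exact method:

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Linear):
    """Linear layer with a fixed unstructured sparsity mask on its weights."""

    def __init__(self, in_features: int, out_features: int, sparsity: float = 0.75):
        super().__init__(in_features, out_features)
        # Random unstructured mask: keep (1 - sparsity) of the weights.
        mask = (torch.rand(out_features, in_features) > sparsity).float()
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Masked-out weights contribute nothing; hardware that skips
        # zeros (e.g., the Cerebras CS-2) turns this into real speedup.
        return nn.functional.linear(x, self.weight * self.mask, self.bias)

    def densify(self) -> None:
        """Drop the mask before fine-tuning to recover dense capacity."""
        self.mask.fill_(1.0)

layer = MaskedLinear(512, 512, sparsity=0.75)  # sparse pre-training phase
layer.densify()                                # dense fine-tuning phase
```

On commodity GPUs the masked weights still occupy dense tensors, so the FLOP reduction is only realized on hardware or kernels that exploit unstructured sparsity, which is the point the citing paper makes.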