2018
DOI: 10.48550/arxiv.1802.06905
Preprint

Communication-Optimal Convolutional Neural Nets

Abstract: Efficiently executing convolutional neural nets (CNNs) is important in many machine learning tasks. Since the cost of moving a word of data, either between levels of a memory hierarchy or between processors over a network, is much higher than the cost of an arithmetic operation, minimizing data movement is critical to performance optimization. In this paper, we present both new lower bounds on data movement needed for both convolutional and pooling layers of CNNs, and optimal sequential algorithms that attain t…

Citations: cited by 11 publications (20 citation statements)
References: 6 publications
“…Lemma 4.2. For $\tau \in \mathbb{Z}^{+}$ and the nested bilinear algorithm $F = (A^{\otimes \tau}, B^{\otimes \tau}, C^{\otimes \tau})$, where $X^{\otimes \tau} = \bigotimes_{i=1}^{\tau} X$ and $(A, B, C)$ is the bilinear algorithm for Strassen's base algorithm (Definition 7.1), (3) is an expansion bound for $F$.…”
Section: Fast Matrix Multiplication (mentioning)
confidence: 99%
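
The nested bilinear algorithm named in the lemma quoted above can be illustrated with a small sketch. It assumes the standard textbook form of Strassen's seven products for the base encoding (the citing paper's Definition 7.1 is not reproduced here, so the exact matrices and their ordering are an assumption), and it assumes inputs are vectorized in recursive block order so that Kronecker powers of the base matrices line up with the recursion. The helper names (`kron`, `kron_power`, `to_recursive_vec`, `nested_multiply`) are hypothetical.

```python
# Assumed 4x7 encodings of Strassen's base algorithm.
# Rows follow the entry order [x11, x12, x21, x22]; columns are products M1..M7.
A = [[1, 0, 1, 0, 1, -1, 0],
     [0, 0, 0, 0, 1, 0, 1],
     [0, 1, 0, 0, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, -1]]
B = [[1, 1, 0, -1, 0, 1, 0],
     [0, 0, 1, 0, 0, 1, 0],
     [0, 0, 0, 1, 0, 0, 1],
     [1, 0, -1, 0, 1, 0, 1]]
C = [[1, 0, 0, 1, -1, 0, 1],
     [0, 0, 1, 0, 1, 0, 0],
     [0, 1, 0, 1, 0, 0, 0],
     [1, -1, 1, 0, 0, 1, 0]]


def kron(X, Y):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_x for y in row_y] for row_x in X for row_y in Y]


def kron_power(X, tau):
    """tau-fold Kronecker power of X, for tau >= 1."""
    result = X
    for _ in range(tau - 1):
        result = kron(result, X)
    return result


def to_recursive_vec(M):
    """Flatten a 2^t x 2^t matrix in recursive block order (11, 12, 21, 22)."""
    n = len(M)
    if n == 1:
        return [M[0][0]]
    h = n // 2
    out = []
    for p in (0, 1):
        for q in (0, 1):
            block = [row[q * h:(q + 1) * h] for row in M[p * h:(p + 1) * h]]
            out += to_recursive_vec(block)
    return out


def apply_bilinear(A_, B_, C_, a, b):
    """Evaluate c = C_ ((A_^T a) * (B_^T b)) with an elementwise product."""
    rank = len(A_[0])
    left = [sum(A_[i][m] * a[i] for i in range(len(a))) for m in range(rank)]
    right = [sum(B_[i][m] * b[i] for i in range(len(b))) for m in range(rank)]
    prods = [l * r for l, r in zip(left, right)]
    return [sum(C_[k][m] * prods[m] for m in range(rank)) for k in range(len(C_))]


def nested_multiply(X, Y, tau):
    """Multiply 2^tau x 2^tau matrices; result is in recursive block order."""
    return apply_bilinear(kron_power(A, tau), kron_power(B, tau), kron_power(C, tau),
                          to_recursive_vec(X), to_recursive_vec(Y))


def naive_matmul(X, Y):
    """Reference product used only to check the sketch."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

For τ = 2 the encoding matrices are 16 × 49, which corresponds to 7² = 49 scalar products for a 4 × 4 multiplication. Comparing `nested_multiply(X, Y, 2)` with `to_recursive_vec(naive_matmul(X, Y))` on small integer matrices is a quick way to check that the assumed encodings are consistent.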
“…Repeatedly applying Corollary 3.7, we find $\sigma(k)$ remains a rank expansion lower bound for $A^{\otimes \tau}$ as well as $B^{\otimes \tau}$ and $C^{\otimes \tau}$. Then for $k \in [7^{\tau}]$ and P ∈ P 3) to both sides leads to…”
Section: Fast Matrix Multiplication (mentioning)
confidence: 99%
“…Since analyzing programs with parametric sizes disallows the construction of an explicit Computation Directed Acyclic Graph (CDAG), some form of parameterization is often needed [18][19][20]. However, we argue that the widely-used approaches based on the Loomis-Whitney or the HBL inequalities [21][22][23] (a) are often too restrictive, requiring the programs to be expressed in the polyhedral model to count the points in the projection polytopes; (b) do not capture pebbling motifs such as recomputation [19]; or (c) are limited to single-statement programs [7, 21-23, 23, 24].…”
Section: Introduction (mentioning)
confidence: 99%