Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2021
DOI: 10.1145/3437801.3441609

I/O lower bounds for auto-tuning of convolutions in CNNs

Abstract: Convolution is the most time-consuming part of the computation in convolutional neural networks (CNNs), which have achieved great success in numerous practical applications. Due to the complex data dependencies and the growing amount of model samples, convolution suffers from high data-movement (i.e., memory-access) overhead. This work provides a comprehensive analysis and methodologies to minimize the communication for the convolution in CNNs. With an in-depth analysis of the recent I/O complexit…
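To ground the data-movement concern, here is a minimal sketch of direct convolution (illustrative only; the layout, shapes, and function name are assumptions, not the paper's implementation). Every output element re-reads an input patch and a filter, so without tiling the same data crosses the memory hierarchy many times:

```python
# Minimal direct convolution sketch, assuming CHW layout, unit stride, no padding.
# Illustrative only -- not the paper's implementation or its optimal dataflow.
import numpy as np

def direct_conv(x, w):
    """x: (C_in, H, W) input; w: (C_out, C_in, K, K) filters."""
    c_in, h, wd = x.shape
    c_out, _, k, _ = w.shape
    assert w.shape[1] == c_in
    out = np.zeros((c_out, h - k + 1, wd - k + 1))
    for co in range(c_out):
        for i in range(h - k + 1):
            for j in range(wd - k + 1):
                # Each output element touches a (C_in, K, K) input patch plus
                # one filter; untiled, these patches are re-fetched from slow
                # memory for every (co, i, j), which drives the I/O cost.
                out[co, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[co])
    return out

x = np.random.rand(3, 8, 8)     # C_in=3, H=W=8
w = np.random.rand(4, 3, 3, 3)  # C_out=4, K=3
print(direct_conv(x, w).shape)  # (4, 6, 6)
```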

Cited by 6 publications (7 citation statements), published 2021–2024
References 28 publications
“…Neural Networks. Analyzing I/O lower bounds of neural networks is a nascent field, and so far only single-layer convolution has been analyzed [20, 23]. We improve the bound previously reported by Zhang et al. [20] by a factor of 8.…”
Section: Discussion (mentioning)
confidence: 60%
“…The first asymptotic I/O lower bound for single-layer direct convolution was proved by Demmel et al. [23]. Chen et al. [55] propose a matching implementation, and Zhang et al. [20] present the first non-asymptotic I/O lower bound for Winograd convolution.…”
Section: Related Work (mentioning)
confidence: 99%
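For context, Winograd convolution, whose non-asymptotic I/O lower bound [20] is discussed above, reduces multiplications by transforming tiles. Below is a textbook 1-D F(2,3) instance, a sketch in the standard Lavin–Gray formulation rather than any dataflow from the cited papers: it computes two outputs of a 3-tap filter with four multiplications.

```python
def winograd_f23(d, g):
    """1-D Winograd F(2,3): d is 4 inputs, g is 3 filter taps -> 2 outputs,
    using 4 multiplications instead of the 6 a direct computation needs."""
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]

# Check against the direct definition y[i] = sum_k d[i+k] * g[k].
d, g = [1.0, 2.0, 3.0, 4.0], [0.5, 0.25, 0.125]
ref = [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]
assert all(abs(a - b) < 1e-12 for a, b in zip(winograd_f23(d, g), ref))
```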
“…This novel technique is called deep reuse. Another recent and sophisticated approach [28] divides the Winograd algorithm into subcomputations and establishes data-movement lower bounds to derive the optimal I/O dataflow that maximizes data reuse. Furthermore, the authors propose an auto-tuning technique to dynamically find the optimal parameter configuration.…”
Section: Related Work (mentioning)
confidence: 99%
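As a rough illustration of such auto-tuning, the toy sketch below enumerates tile sizes that fit a fast-memory budget and keeps the one minimizing a naive traffic estimate; the cost model, parameters, and function name are assumptions for illustration, not the method of [28].

```python
# Toy auto-tuner: pick output-tile sizes for a K x K convolution that fit
# a fast-memory budget and minimize a simple memory-traffic estimate.
# The cost model and all parameters are illustrative assumptions, not from [28].
import itertools

def tune_tiles(H, W, K, fast_mem_words):
    """Return (traffic, tile_h, tile_w) minimizing the traffic estimate."""
    best = None
    for th, tw in itertools.product(range(1, H + 1), range(1, W + 1)):
        # Per-tile footprint: input region including the (K-1) halo + output tile.
        footprint = (th + K - 1) * (tw + K - 1) + th * tw
        if footprint > fast_mem_words:
            continue  # tile would not fit in fast memory
        n_tiles = -(-H // th) * (-(-W // tw))  # ceiling divisions
        traffic = n_tiles * footprint          # words moved, one pass per tile
        if best is None or traffic < best[0]:
            best = (traffic, th, tw)
    return best

traffic, th, tw = tune_tiles(H=56, W=56, K=3, fast_mem_words=4096)
print(f"best tile {th}x{tw}, estimated traffic {traffic} words")
```

A real tuner would also search over channel blocking and loop orders and validate candidates by measurement; this sketch only conveys the search-under-a-capacity-constraint idea.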