Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2020
DOI: 10.1145/3373376.3378514
Interstellar: Using Halide's Scheduling Language to Analyze DNN Accelerators

Cited by 144 publications (24 citation statements) | References 36 publications
“…What characterizes the above-discussed dataflows is that all the operations along dimensions H_k and W_k are mapped to the 2D PE array and executed in parallel. This mapping operation is defined as spatial unrolling in [118]. From a software perspective this is equivalent to replacing the for loops in the 7-nested loop representation with parallel for loops (par_for) as in Figure 34.…”
Section: Spatial Architectures and Dataflow Processing
confidence: 99%
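To make the quoted description concrete, below is a minimal Python sketch (not code from [118] or from the cited paper; all loop names and tensor shapes are illustrative) of the 7-nested convolution loop, with the two filter-dimension loops, playing the role of H_k and W_k, marked as the ones a spatially unrolled dataflow would run in parallel on the 2D PE array:

```python
import numpy as np

def conv_7loop(I, W, O):
    """7-nested convolution loop. Illustrative shapes: I is
    (N, C, OY+FY-1, OX+FX-1), W is (K, C, FY, FX), O is (N, K, OY, OX)."""
    N, K, OY, OX = O.shape
    _, C, FY, FX = W.shape
    for n in range(N):                    # batch
        for k in range(K):                # output channels
            for c in range(C):            # input channels
                for oy in range(OY):      # output rows
                    for ox in range(OX):  # output columns
                        # The two loops below play the role of H_k and W_k:
                        # a spatially unrolled (H_k|W_k) dataflow turns them
                        # into par_for loops mapped onto the 2D PE array.
                        for fy in range(FY):
                            for fx in range(FX):
                                O[n, k, oy, ox] += (
                                    I[n, c, oy + fy, ox + fx] * W[k, c, fy, fx]
                                )
```

In hardware the two marked loops do not iterate at all: each (fy, fx) pair corresponds to a distinct PE, so all FY×FX multiply-accumulates for one output element can happen in the same cycle.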
“…From a software perspective this is equivalent to replacing the for loops in the 7-nested loop representation with parallel for loops (par_for) as in Figure 34. In [118], the H_k|W_k syntax is adopted to denote which loops are parallelized. The stationarity of the weights is instead equivalent, from the software perspective, to a loop reordering operation of the for loops, as shown in Figure 34.…”
Section: Spatial Architectures and Dataflow Processing
confidence: 99%
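The loop-reordering view of weight stationarity mentioned in the quote can be sketched the same way (again illustrative, not the paper's code): hoisting the weight-indexing loops outermost means each weight is fetched once and reused across all batch elements and output positions before the next weight is loaded:

```python
def conv_weight_stationary(I, W, O):
    # Same arithmetic as conv_7loop above, reordered so the loops that index
    # the weight tensor (k, c, fy, fx) are outermost. Each weight value is
    # loaded once ("stationary") and reused by the three inner loops.
    N, K, OY, OX = O.shape
    _, C, FY, FX = W.shape
    for k in range(K):
        for c in range(C):
            for fy in range(FY):
                for fx in range(FX):
                    w = W[k, c, fy, fx]   # fetched once, then reused
                    for n in range(N):
                        for oy in range(OY):
                            for ox in range(OX):
                                O[n, k, oy, ox] += I[n, c, oy + fy, ox + fx] * w
```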
“…The critical status of memory design has attracted extensive research. Most previous studies focus on simple layer-level optimization (the left one of Figure 1) by applying loop transformation techniques such as tiling and reordering to fit the memory size and reuse the on-chip data [23, 43, 44, 61, 70]. In addition, several works also guide the memory capacity and hierarchy design using design-space exploration [12, 32, 37, 66, 67].…”
Section: Introduction
confidence: 99%
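As a concrete illustration of the tiling transformation the quote refers to (the tile size and function name are hypothetical, not taken from the cited works), a blocked matrix multiply splits each loop so that one tile of each operand fits in an on-chip buffer and is reused across the inner iterations:

```python
import numpy as np

def tiled_matmul(A, B, T=64):
    # Loop tiling (blocking): walk the iteration space in T x T tiles so each
    # tile of A, B, and the C accumulator fits in a fast on-chip buffer and is
    # reused many times before eviction. T is a hypothetical tile size chosen
    # to match the buffer capacity.
    M, K = A.shape
    _, N = B.shape
    C = np.zeros((M, N))
    for i0 in range(0, M, T):
        for j0 in range(0, N, T):
            for k0 in range(0, K, T):
                # One tile-sized multiply-accumulate; NumPy slicing trims
                # ragged edges automatically when T does not divide M, N, K.
                C[i0:i0+T, j0:j0+T] += A[i0:i0+T, k0:k0+T] @ B[k0:k0+T, j0:j0+T]
    return C
```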
“…Morph [56] provides a flexible architecture that can accelerate 3D convolutional neural networks (3D CNNs), which have large memory requirements and higher dimensionality. Work by Chen et al. [57] optimizes energy efficiency by ensuring proper dataflow mapping; however, current results also show that choosing a specific dataflow mapping scheme for an architecture still needs a lot of investigation [58]. Neurocube [59], on the other hand, provides a programmable and scalable digital architecture for brain-inspired algorithms.…”
Section: Existing DNN Accelerators
confidence: 96%
“…Hardware architecture space explorations and optimizations involve inspection of a diverse set of configurable architectural and algorithmic parameters, thus a rapid estimation of the hardware costs becomes extremely significant. The existing state-of-the-art (SOTA) architectural simulators [73, 86, 90, 121] simulate the energy consumption and execution time of deep neural network processing at low speed (a couple of minutes, hours, or even days to simulate a single operating point). Though prevalent machine learning techniques may improve the runtime of such simulations [124-126], these mechanisms bank on the availability of a large training dataset with limited architectural options or parameters to explore and optimize. A likely possibility to achieve an extremely fast estimation of energy consumption and performance of deep neural network processing is to formulate and analyse closed-form analytical representations of the target cost functions.…”
confidence: 99%
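A minimal sketch of the closed-form idea described in the quote: rather than simulating cycle by cycle, sum per-level memory-access counts (which an analytical reuse model would produce) weighted by an energy cost per access. All level names and per-access costs below are hypothetical placeholders, not measured values:

```python
# Hypothetical per-access energy costs in picojoules (placeholders, not data).
ENERGY_PER_ACCESS_PJ = {"register": 0.1, "sram": 5.0, "dram": 200.0}

def estimate_energy_pj(accesses):
    """accesses: dict mapping memory level -> analytically derived access count.
    Returns a closed-form energy estimate: sum of count x cost per level."""
    return sum(ENERGY_PER_ACCESS_PJ[level] * n for level, n in accesses.items())

# Example: access counts an analytical model might emit for one layer.
print(estimate_energy_pj({"register": 1e9, "sram": 1e7, "dram": 1e5}))
```

Because the estimate is a closed-form sum, evaluating one design point takes microseconds, in contrast to the minutes-to-days per operating point the quote attributes to SOTA simulators.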