2021
DOI: 10.1109/mm.2021.3061912
Compute Substrate for Software 2.0

Cited by 15 publications (11 citation statements); references 2 publications.
“…However, most AI multi-core chips are commercial products whose architecture details are not available, for example, the Cerebras WSE series [20] and Tenstorrent [21]. To the best of our knowledge, only Graphcore's Intelligence Processing Unit reveals its parallel computing model [17], and only Tenstorrent claims support for model-level parallelism [22].…”
Section: Upsurging of MIMD Multi-core Accelerators
confidence: 99%
“…Uneven distribution of communication load across env groups leads to a remarkable decrease in overall performance. A common solution, shown in Figure 6, pipelines upstream computing, communication, and downstream computing to maximize their overlap [22]. In Figure 6, C-1 denotes the 1st segment of the Conv in L1.…”
Section: Realizing Load Balance for High Hardware Utilization
confidence: 99%
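The overlap described in that statement can be illustrated with a toy makespan model for a three-stage (upstream compute, communication, downstream compute) pipeline. This is a hypothetical sketch with invented, uniform per-segment stage times, not the cited paper's actual schedule:

```python
def serial_time(num_segments, up, comm, down):
    # No overlap: every segment runs its three stages back to back.
    return num_segments * (up + comm + down)

def pipelined_time(num_segments, up, comm, down):
    # Idealized pipeline: after the first segment fills the pipe,
    # each additional segment costs only the slowest stage.
    return up + comm + down + (num_segments - 1) * max(up, comm, down)

# Example: 4 segments, upstream compute 3, communication 2, downstream 3.
print(serial_time(4, 3, 2, 3))     # 32
print(pipelined_time(4, 3, 2, 3))  # 8 + 3 * 3 = 17
```

In this idealized model the pipeline hides all but the slowest stage of each segment, which is why balancing the communication load across segments matters: one oversized communication stage becomes the per-segment bottleneck.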
“…As Deep Neural Networks (DNNs) tackle increasingly complex problems, their size and complexity grow rapidly, resulting in increased computing and storage demands [9], [10], [51], [56], [63]. While applying more advanced technology and enlarging single chip sizes have led to the development of many large-scale monolithic accelerators with tens of billions of transistors [23], [24], [28], [55], the end of Moore's Law [35] and limited photomask size pose significant challenges to further transistor integration. Chiplet technology, using advanced packaging to combine small functional dies, offers a promising solution to overcome these limitations and enable continuous transistor integration.…”
Section: Introduction
confidence: 99%
“…For the first challenge, maintaining high utilization and energy efficiency becomes increasingly difficult as accelerators scale up. LP mapping, in which multiple layers are spatially mapped onto the accelerator, is widely employed by large-scale accelerators in both academia [15], [45], [46], [57] and industry [21], [52], [55] to mitigate this challenge. The core of LP mapping is spatial mapping (SPM), which determines which part of which layer is allocated to which core, and significantly impacts the performance and energy efficiency of large-scale accelerators.…”
Section: Introduction
confidence: 99%
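The spatial-mapping decision that statement describes (which part of which layer goes to which core) can be sketched as a greedy longest-processing-time assignment that balances per-core load. All segment names and costs below are invented for illustration; real SPM also weighs communication, not just compute load:

```python
def greedy_spm(segments, num_cores):
    """Assign (name, cost) layer segments to cores, always placing the
    next-largest segment on the currently least-loaded core."""
    loads = [0.0] * num_cores
    assignment = {}
    for name, cost in sorted(segments, key=lambda s: -s[1]):
        core = loads.index(min(loads))  # pick the least-loaded core
        assignment[name] = core
        loads[core] += cost
    return assignment, loads

# Hypothetical layer segments with relative compute costs.
segments = [("conv1.a", 6), ("conv1.b", 5), ("conv2", 4), ("fc", 3)]
assignment, loads = greedy_spm(segments, num_cores=2)
print(loads)  # [9.0, 9.0] -- both cores end up evenly loaded
```

Even this simple heuristic shows why SPM drives utilization: a skewed assignment would leave one core idle while the other bounds the overall latency.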