2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA) 2021
DOI: 10.1109/hpca51647.2021.00056

Layerweaver: Maximizing Resource Utilization of Neural Processing Units via Layer-Wise Scheduling

Cited by 28 publications (10 citation statements)
References 39 publications
“…Due to the wide spectrum of NPUs and DNN models, it is nearly impossible to balance the usage of NPU resources for all DNNs. Recently, several proposals have addressed this problem by time-multiplexing layer-wise execution of multiple DNN models with opposing characteristics (e.g., memory-intensive and compute-intensive) to saturate both compute and memory bandwidth [5], [12]. Figure 1 vidual requests.…”
Section: Limitations Of the Prior Art (mentioning)
confidence: 99%
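
The benefit of pairing layers with opposing characteristics, as the statement above describes, can be illustrated with a toy timing model. This is only a sketch under stated assumptions: the `Layer` type, the time values, and the rule that a layer's compute and memory traffic fully overlap are hypothetical simplifications, not taken from the cited papers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    compute: float  # time the compute units are busy (hypothetical units)
    memory: float   # time the memory channel is busy (hypothetical units)


def solo_time(layer: Layer) -> float:
    """Time to run one layer alone, assuming compute and memory overlap."""
    return max(layer.compute, layer.memory)


def paired_time(a: Layer, b: Layer) -> float:
    """Time to run two layers from different models together.

    Assumption: both layers share the compute units and the memory
    channel, and the two resources can be kept busy at the same time.
    """
    return max(a.compute + b.compute, a.memory + b.memory)


compute_bound = Layer(compute=4.0, memory=1.0)
memory_bound = Layer(compute=1.0, memory=4.0)

# Running the layers one after another leaves one resource mostly idle.
back_to_back = solo_time(compute_bound) + solo_time(memory_bound)

# Interleaving them keeps both resources busy in this toy model.
interleaved = paired_time(compute_bound, memory_bound)
```

In this model the benefit disappears when both layers are bound by the same resource, which is why the pairing of opposing characteristics matters.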
“…Layerweaver+ replaces the greedy scheduler of Layerweaver [5] to achieve the two design goals: maximizing the inference throughput and bounding each request's latency to a given deadline. To this end, we introduce two operation modes in Layerweaver+: 1) Select the layer that causes the minimum resource idle time first, and 2) Select the layer with the minimum QoS slack time.…”
Section: Design and Implementation (mentioning)
confidence: 99%
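
The two selection rules named in the statement above can be sketched as follows. The `Candidate` type, its field names, and the `mode` strings are hypothetical, introduced only for illustration. The excerpt does not say how the scheduler chooses between the two modes, so the mode is left as a caller-supplied parameter here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    request_id: str
    idle_time: float  # estimated resource idle time if this layer runs next
    slack: float      # estimated time remaining before the request's deadline


def select_next(candidates: Sequence[Candidate], mode: str) -> Candidate:
    """Pick the next layer to run under one of the two modes.

    "min_idle":  the layer that causes the minimum resource idle time.
    "min_slack": the layer with the minimum QoS slack time.
    """
    if not candidates:
        raise ValueError("no candidate layers to schedule")
    if mode == "min_idle":
        return min(candidates, key=lambda c: c.idle_time)
    if mode == "min_slack":
        return min(candidates, key=lambda c: c.slack)
    raise ValueError(f"unknown mode: {mode!r}")


pending = [
    Candidate(request_id="a", idle_time=0.2, slack=9.0),
    Candidate(request_id="b", idle_time=1.5, slack=0.5),
]
throughput_pick = select_next(pending, "min_idle")
deadline_pick = select_next(pending, "min_slack")
```

With these example values the two modes pick different requests: the first favors utilization, the second favors the request closest to its deadline.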