2022
DOI: 10.21203/rs.3.rs-2266264/v1
Preprint

Workload Analysis and Prediction of Multi-type GPU in Heterogeneous GPU Clusters

Abstract: Heterogeneous GPU clusters play an important role in processing parallel applications and massive data sets on cloud platforms. However, due to the diversity of GPU types, effectively allocating the various GPU types is a challenge. This paper first analyzes the request and allocation characteristics of various GPU types based on Alibaba cluster data. We then propose a method to adaptively select the best model for predicting the demand of each GPU type, and to extract features from the best model. Further,…
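The adaptive model-selection idea in the abstract can be illustrated with a minimal sketch: evaluate several candidate forecasters on a held-out tail of the demand history and keep the one with the lowest error. The candidate models, holdout length, and MAE metric below are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def select_best_model(history, candidates, holdout=24):
    """Pick the candidate whose forecasts minimize MAE on a held-out tail.

    history    -- 1-D array of past demand for one GPU type
    candidates -- dict: name -> function(train, horizon) -> forecast array
    """
    train, test = history[:-holdout], history[-holdout:]
    errors = {}
    for name, model in candidates.items():
        forecast = model(train, holdout)
        errors[name] = float(np.mean(np.abs(forecast - test)))
    best = min(errors, key=errors.get)
    return best, errors

# Two toy forecasters standing in for real demand-prediction models.
def naive_last(train, h):
    return np.full(h, train[-1])

def moving_average(train, h, w=12):
    return np.full(h, train[-w:].mean())

demand = np.abs(np.sin(np.arange(200) / 10)) * 50 + 5  # synthetic demand series
best, errs = select_best_model(demand, {"naive": naive_last,
                                        "moving_avg": moving_average})
print(best, errs)
```

In a real deployment, the candidate set would hold the paper's actual time-series models and the selection would be rerun per GPU type as new demand data arrives.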

Cited by 15 publications (6 citation statements)
References 23 publications
“…We considered a subset of the ImageNet competition dataset [38], including 10 classes with 1300 images each. We trained jobs using ResNet [39], VGG16 [40], AlexNet [41], and MobileNetV2 [42], varying the batch size (16, 32, 64) and the optimizer (Adam, SGD). We set the maximum number of epochs to 100 for all jobs and recorded after how many epochs each job terminates under a patience-based stopping criterion.…”
Section: A. Experimental Setup and Methodology
confidence: 99%
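The patience-based stopping criterion mentioned in the quote can be sketched as follows; the patience value and the synthetic loss sequence are illustrative, not taken from the cited experimental setup.

```python
def epochs_until_stop(val_losses, patience=5, max_epochs=100):
    """Return the epoch at which training stops: when the validation loss
    has not improved for `patience` consecutive epochs, or at max_epochs."""
    best = float("inf")
    since_improvement = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            since_improvement = 0
        else:
            since_improvement += 1
        if since_improvement >= patience:
            return epoch
    return min(len(val_losses), max_epochs)

# Loss improves through epoch 10, then plateaus, so training stops
# at epoch 10 + patience.
losses = [1.0 / e for e in range(1, 11)] + [0.1] * 90
print(epochs_until_stop(losses, patience=5))  # → 15
```

This is the same mechanism exposed by common framework callbacks (e.g. a `patience` parameter on an early-stopping hook), recorded here as the per-job termination epoch.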
“…Note that, while deciding to deploy jobs on single nodes may seem a limitation, a recent DL workload analysis [32] highlighted that over 50% of jobs require a single GPU, while jobs using more than 8 GPUs (which we will consider the maximum node size in our experimental evaluation) account for less than 10%. Moreover, enforcing GPU locality yields over 10× speed-up [16].…”
Section: Resource Selection-Job Scheduling Problem
confidence: 99%
“…To compute the SLO attainment with a given set of requests and placement, in AlpaServe, we assume we know the arrival process in advance. Although short-term burstiness is impossible to predict, the arrival pattern over longer timescales (e.g., hours or days) is often predictable [43]. Given this predictability, AlpaServe either directly uses the history request traces or fits a distribution from the trace and resamples new traces from the distribution as the input workload to the simulator to compute the SLO attainment.…”
Section: Placement Algorithm
confidence: 99%
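The fit-and-resample step described in the quote can be sketched roughly as follows. Fitting a Gamma distribution to inter-arrival times via the method of moments is one plausible choice; the quote does not specify which distribution family AlpaServe actually uses.

```python
import numpy as np

def resample_trace(arrival_times, n, rng=None):
    """Fit inter-arrival times with a Gamma distribution (method of
    moments) and sample a new synthetic arrival trace of n requests."""
    rng = np.random.default_rng(rng)
    gaps = np.diff(np.sort(np.asarray(arrival_times, dtype=float)))
    mean, var = gaps.mean(), gaps.var()
    shape = mean**2 / var if var > 0 else 1.0  # method-of-moments fit
    scale = var / mean if var > 0 else mean
    new_gaps = rng.gamma(shape, scale, size=n)
    return np.cumsum(new_gaps)  # synthetic arrival timestamps

# History trace: 500 arrivals with exponential gaps (mean 2.0 time units).
trace = np.cumsum(np.random.default_rng(0).exponential(2.0, size=500))
synthetic = resample_trace(trace, n=1000, rng=1)
print(len(synthetic), synthetic[-1] / 1000)  # mean gap ≈ 2.0
```

A trace resampled this way preserves the longer-timescale rate of the history while generating fresh short-term variation, which is what the simulator needs to estimate SLO attainment.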
“…Furthermore, there is significant and unpredictable burstiness in the arrival process of user requests. To meet tight SLO, contemporary serving systems are forced to over-provision compute resources, resulting in low cluster utilization [43].…”
Section: Introduction
confidence: 99%