2021 IEEE High Performance Extreme Computing Conference (HPEC) 2021
DOI: 10.1109/hpec49654.2021.9622850
The MIT Supercloud Dataset

Cited by 18 publications (10 citation statements)
References 17 publications
“…Given the intensive compute resources required to conduct such scaling studies, we intend to make all experimental data from this study publicly available as part of the MIT Supercloud Datacenter Challenge [35] via this https URL.…”
Section: Discussion
Citation type: mentioning
Confidence: 99%
“…Traditionally, HPC centers limit GPU usage to prevent users from misusing systems, while cloud providers eagerly allow users to provision as many resources as they can afford. Rarely do scientific DL practitioners examine their resource needs; most workflows are either run on a single GPU due to the lack of engineering infrastructure needed to scale, or are run on the maximum number of available GPUs [18,35]. Efficient training and scaling strategies may be even more important than architecture details in some domains [4,31,43].…”
Section: Introduction
Citation type: mentioning
Confidence: 99%
“…We use various types of deep learning (DL) workloads because the recent advancement in DL algorithms has made them popular in scientific research and production datacenters [41][42][43]. We uniformly sample the DL model and training batch size from Table 2.…”
Section: Methods
Citation type: mentioning
Confidence: 99%
“…We use various types of deep learning (DL) workloads because the recent advancement in DL algorithms has made them popular in scientific research and production datacenters [43][44][45]. We uniformly sample the DL model and training batch size from Table 2.…”
Section: Methods
Citation type: mentioning
Confidence: 99%