GPU swap-aware scheduler

Yang, Su-Wei; Qiu, Zhao-Wei; Chen, Ya-Shu

doi:10.1145/3341105.3373866

Proceedings of the 35th Annual ACM Symposium on Applied Computing 2020

Cited by 1 publication

References 16 publications

ATP: Achieving Throughput Peak for DNN Training via Smart GPU Memory Management

Chen, Dong, Zhang et al. 2024

ACM Trans. Archit. Code Optim.

Due to limited GPU memory, the performance of large DNN training is constrained by the unscalable batch size. Existing research partially addresses the GPU memory limit through tensor recomputation and swapping, but overlooks the exploration of optimal performance. In response, we propose ATP, a recomputation- and swapping-based GPU memory management framework that aims to maximize training performance by breaking GPU memory constraints. ATP uses a throughput model that we propose to evaluate the theoretical peak performance achievable by DNN training on a GPU and to provide the optimum memory size required for recomputation and swapping. We optimize the mechanisms for the GPU memory pool and CUDA stream control, and employ an optimization method to search for the specific tensors requiring recomputation and swapping, thereby bringing the actual DNN training performance on ATP closer to the theoretical values. Evaluations with different types of large DNN models indicate that ATP achieves throughput improvements ranging from 1.14× to 1.49×, while supporting model training that exceeds the GPU memory limit by up to 9.2×.
