Deep Learning (DL) applications have become an essential tool for analyzing and making predictions over big data in many areas. However, DL applications place heavy input/output (I/O) loads on computer systems. When running on distributed systems or distributed-memory parallel systems, these applications must read large amounts of data during the training stage. Their inherently parallel, distributed execution and persistent file accesses can easily overwhelm traditional shared file systems and degrade application performance. Managing these applications is therefore a constant challenge as their popularity on HPC systems grows, since such systems have traditionally run, and are optimized for, scientific applications and simulators. It is thus essential to identify the key factors involved in the I/O of a DL application in order to find the configuration that minimizes the impact of I/O on its performance. In this work, we present an analysis of the patterns generated by I/O operations during the training stage of distributed deep learning applications, using two well-known datasets, CIFAR and MNIST, to describe their file access patterns.
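As a rough illustration of the kind of access pattern discussed above, the following minimal Python sketch logs the offset and size of every read issued while iterating a record-structured dataset file in shuffled mini-batches. It is not the authors' instrumentation; the record size, dataset size, and file layout are assumptions chosen only to mimic MNIST-style fixed-size samples.

```python
# Hypothetical sketch: trace (offset, size) of each read request produced by
# shuffled mini-batch iteration over a fixed-size-record dataset file.
# RECORD_SIZE, NUM_RECORDS, and BATCH_SIZE are illustrative assumptions.
import os
import random

RECORD_SIZE = 784          # e.g., one flattened 28x28 MNIST-style image (assumption)
NUM_RECORDS = 1000         # synthetic dataset size (assumption)
BATCH_SIZE = 32

# Build a synthetic dataset file of fixed-size records.
path = "synthetic_dataset.bin"
with open(path, "wb") as f:
    f.write(os.urandom(RECORD_SIZE * NUM_RECORDS))

trace = []                 # (offset, size) of each read request
indices = list(range(NUM_RECORDS))
random.shuffle(indices)    # shuffling drives the non-sequential access pattern

with open(path, "rb") as f:
    for start in range(0, NUM_RECORDS, BATCH_SIZE):
        for idx in indices[start:start + BATCH_SIZE]:
            offset = idx * RECORD_SIZE
            f.seek(offset)
            data = f.read(RECORD_SIZE)
            trace.append((offset, len(data)))

print(f"reads issued: {len(trace)}, request size: {RECORD_SIZE} bytes")
print("first requests (offset, size):", trace[:5])
os.remove(path)
```

Running the sketch shows many small, randomly ordered read requests per epoch, which is precisely the kind of workload that stresses shared file systems in distributed training.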