2020
DOI: 10.1007/978-3-030-50743-5_5
HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training with TensorFlow

Abstract: To reduce the training time of large-scale Deep Neural Networks (DNNs), Deep Learning (DL) scientists have started to explore parallelization strategies like data-parallelism, model-parallelism, and hybrid-parallelism. While data-parallelism has been extensively studied and developed, several problems exist in realizing model-parallelism and hybrid-parallelism efficiently. Four major problems we focus on are: 1) defining a notion of a distributed model across processes, 2) implementing forward/back-propagation…
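As context for the parallelization strategies named in the abstract, the sketch below illustrates the simplest one, data-parallelism, using plain mpi4py with a Keras model: every rank trains an identical replica on its own shard of each batch, and gradients are averaged across ranks with MPI Allreduce. This is only a hedged illustration of the general idea; it does not use the HyPar-Flow API, and the model, layer sizes, and optimizer are arbitrary assumptions.

```python
# Sketch: data-parallel training with MPI gradient averaging (not HyPar-Flow's API).
from mpi4py import MPI
import numpy as np
import tensorflow as tf

comm = MPI.COMM_WORLD
size = comm.Get_size()

# Every rank holds a full, identical replica of the model (data-parallelism).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10),
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

def train_step(x_shard, y_shard):
    # Local forward/backward pass on this rank's shard of the global batch.
    with tf.GradientTape() as tape:
        logits = model(x_shard, training=True)
        loss = loss_fn(y_shard, logits)
    grads = tape.gradient(loss, model.trainable_variables)

    # Average each gradient tensor across all ranks (Allreduce-sum, then divide).
    averaged = []
    for g in grads:
        local = g.numpy()
        summed = np.empty_like(local)
        comm.Allreduce(local, summed, op=MPI.SUM)
        averaged.append(tf.convert_to_tensor(summed / size))

    # All replicas apply the same averaged update and stay in sync.
    optimizer.apply_gradients(zip(averaged, model.trainable_variables))
    return float(loss)
```

Launched with, e.g., `mpirun -np 4 python train.py`, each rank would feed its own data shard into `train_step`; the Allreduce keeps all model replicas identical after every update.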

Cited by 10 publications (2 citation statements)
References: 10 publications
“…GPU device memory is steadily increasing; however, developing complex DNN models will likely require GPU devices with hundreds of GBs to TBs of memory, similar to what is available in current CPU servers. Other developments addressing the issue of GPU memory limitations include combining both data-parallel approaches with model-parallel approaches (Awan et al, 2020, Van Essen et al, 2015).…”
Section: Discussion
confidence: 99%
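The hybrid approach mentioned in this statement combines data-parallel replication with model-parallelism, where a single model is partitioned across devices so that no one GPU must hold all parameters and activations. A minimal sketch of a layer-wise partition in TensorFlow follows; it assumes two visible GPUs and toy layer sizes, and it is not HyPar-Flow's partitioning scheme.

```python
import tensorflow as tf

class TwoDeviceMLP(tf.keras.Model):
    """Toy model split across two GPUs at one layer boundary (model-parallelism)."""

    def __init__(self):
        super().__init__()
        self.block0 = tf.keras.Sequential([
            tf.keras.layers.Dense(4096, activation="relu"),
            tf.keras.layers.Dense(4096, activation="relu"),
        ])
        self.block1 = tf.keras.Sequential([
            tf.keras.layers.Dense(4096, activation="relu"),
            tf.keras.layers.Dense(10),
        ])

    def call(self, x):
        # Weights are created under the device scope active at first call,
        # so each block's parameters and activations live on its own GPU.
        with tf.device("/GPU:0"):
            h = self.block0(x)
        # The activation tensor crosses the partition boundary here.
        with tf.device("/GPU:1"):
            return self.block1(h)
```

Hybrid-parallelism would then replicate such a partitioned model across nodes and average gradients between the replicas, as in the data-parallel sketch above.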
“…Keras is a machine-learning Application Programming Interface (API). Keras APIs [11] are closely integrated with the TensorFlow 2.0 framework. This study used the sequential model, which is a plain stack of layers.…”
Section: Keras Neural Network Model
confidence: 99%
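A minimal example of the "plain stack of layers" pattern described above, using tf.keras.Sequential (the layer sizes, activations, and loss here are arbitrary placeholders, not taken from the cited study):

```python
import tensorflow as tf

# A Sequential model is a plain stack of layers applied in order.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()  # prints the stacked layers and their parameter counts
```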