2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2018)
DOI: 10.1109/mascots.2018.00037

A Model-Based Approach to Streamlining Distributed Training for Asynchronous SGD

Cited by 25 publications (30 citation statements)
References 21 publications
“…The closest work to our results is [29], which considers jobs which follow a realistic speedup function and have known sizes. [29] also allows server allocations to change over time.…”
Section: Prior Work (mentioning)
confidence: 99%
“…The closest work to our results is [29], which considers jobs which follow a realistic speedup function and have known sizes. [29] also allows server allocations to change over time. [29] proposes and evaluates heuristic policies such as HELL and KNEE, but they make no theoretical guarantee about the performance of their policies.…”
Section: Prior Work (mentioning)
confidence: 99%
“…In particular, the jobs typically request multiple servers simultaneously and hold onto them for the duration of the job. This difference is largely a result of machine learning jobs like TensorFlow [6], which are highly parallel. For example, when we look at Google's Borg Scheduler [10], we see that the number of servers occupied by an individual job can be anywhere from 1 to 100000 [13].…”
Section: Introduction (mentioning)
confidence: 99%
“…Prior Work The sub-linear-sched problem has been an object of immense interest [3,4,11,7,8,1], where practical algorithms include packing based [18], and resource reservation algorithms [14]. Heuristic policies with only numerical performance analysis can be found in [13]. In past, this problem has been considered for the combinatorial discrete allocation model [11], where an integer number of servers are assigned to any job, as well as the continuous allocation model [7,8,1,3,4], that treats the N servers as a single resource block which can be partitioned into any size and assigned to any job.…”
Section: Introduction (mentioning)
confidence: 99%