Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles 2013
DOI: 10.1145/2517349.2522716

Sparrow

Abstract: Large-scale data analytics frameworks are shifting towards shorter task durations and larger degrees of parallelism to provide low latency. Scheduling highly parallel jobs that complete in hundreds of milliseconds poses a major challenge for task schedulers, which will need to schedule millions of tasks per second on appropriate machines while offering millisecond-level latency and high availability. We demonstrate that a decentralized, randomized sampling approach provides near-optimal performance while avoid…

Cited by 439 publications (51 citation statements)
References 16 publications

“…However, with the continuous scaling of cluster size, monolithic schedulers such as Hadoop Scheduler cannot meet the requirement of scalability. Two-level schedulers, such as Yarn and Mesos, distributed schedulers, such as Omega [4], Sparrow [5] and Apollo [6], Borg [7], Torcil [8], and hybrid schedulers, such as Hawk [9], are proposed to figure out some solutions to this problem. Due to decentralized decisions, the distributed schedulers are more favorable for efficient scheduling in large scale clusters.…”
Section: Background and Motivation (mentioning)
confidence: 99%
“…However, the scaling cluster size and more complicated application requirements let efficient scheduling to be a challenge for resource management system. Therefore, many schedulers, such as Hadoop Scheduler [1], Mesos [2], YARN [3], Omega [4], Sparrow [5] and Apollo [6], Borg [7], Torcil [8], Hawk [9] have been emerged in recent years.…”
Section: Introduction (mentioning)
confidence: 99%
“…AutoPro requires each SLO-bound VM to make periodic performance reports available to its controller, in order to leverage its resource-performance models, as proposed in previous works [Zhang et al 2002;Padala et al 2009;Shen et al 2011;Sironi et al 2012;Bartolini et al 2013a;Hoffmann et al 2013;Sironi et al 2014]. Any performance metric meaningful to the user can be used for these reports and to express SLOs; for instance, a web server can use throughput (e.g., requests/s for a web server) or latency (i.e., response time).…”
Section: Performance Metrics and Measurements (mentioning)
confidence: 99%
“…Since PARSEC applications do not natively report performance at runtime, we instrument a subset of the suite 5 to report throughput through our efficient user-space implementation of the Application Heartbeats API [Hoffmann et al 2010;Sironi et al 2012;Bartolini et al 2013a;Sironi et al 2014]. The hypervisor accesses VM performance measurements as in previous work [Padala et al 2009;Shen et al 2011]. In real deployments, performance reports may be obtained from application logs or monitoring infrastructures.…”
Section: Performance Metrics and Measurements (mentioning)
confidence: 99%