Cloud computing has become a well-adopted computing paradigm. With its unprecedented scalability and flexibility, the computational cloud can carry out large-scale computing tasks in parallel. The datacenter cloud is a new cloud computing model that uses multi-datacenter architectures for large-scale data processing and computing. In datacenter cloud computing, the overall efficiency of the cloud depends largely on the workload scheduler, which allocates clients' tasks to different Cloud datacenters. Developing high-performance workload scheduling techniques for Cloud computing poses a great challenge and has been extensively studied. However, most previous work aims only at minimizing the completion time of all tasks, while timeliness is not the only concern: reliability and security are also very important. In this work, a comprehensive Quality of Service (QoS) model is proposed to measure the overall performance of datacenter clouds, and an advanced Cross-Entropy based Stochastic Scheduling (CESS) algorithm is developed to optimize the accumulative QoS and sojourn time of all tasks. Experimental results show that our algorithm improves accumulative QoS and sojourn time by up to 56.1% and 25.4%, respectively, compared to the baseline algorithm. The runtime of our algorithm grows only linearly with the number of Cloud datacenters and tasks. Given the same ratio of arrival rate to service rate, our algorithm steadily generates scheduling solutions with satisfactory QoS without sacrificing sojourn time.

Index Terms - Cloud Computing, Datacenter Clouds, Quality of Service, Workload Scheduling

1 INTRODUCTION

Cloud computing [1], which delivers computing as a service, has emerged as a well-adopted computing paradigm that offers vast computing power and flexibility, and an increasing number of commercial cloud computing services have been deployed in the market, such as Amazon EC2 [2], Google Compute Engine [3], and Rackspace Cloud [4]. The new computing paradigms of "Cloud of Clouds" [5] and "datacenter clouds" [6], [7] create a federated Cloud computing environment that coordinates distributed datacenter computing and achieves high QoS for Cloud applications. Large-scale data-intensive applications running across distributed modern datacenter infrastructures are a good implementation and use case of the "Cloud of Clouds" paradigm. A good example of data-intensive analysis is the field of High Energy Physics (HEP). The four main detectors at the Large Hadron Collider (LHC), ALICE, ATLAS, CMS, and LHCb, produced about 13 petabytes of data in 2010 [8]. This huge amount of data is stored on the Worldwide LHC Computing Grid, which consists of more