Proceedings of the 9th International Conference on Autonomic Computing 2012
DOI: 10.1145/2371536.2371546

Automated profiling and resource management of Pig programs for meeting service level objectives

Abstract: An increasing number of MapReduce applications associated with live business intelligence require completion time guarantees. In this paper, we consider the popular Pig framework that provides a high-level SQL-like abstraction on top of the MapReduce engine. Programs written in this framework are compiled into directed acyclic graphs (DAGs) of MapReduce jobs. There is a lack of performance models and analysis tools for automated performance management of such MapReduce jobs. We offer a performance modeling environ…
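For intuition, the sketch below (all names and the scheduling assumption are illustrative, not the paper's code) shows how a completion-time estimate for a Pig program might be obtained once the program has been compiled into a DAG of MapReduce jobs, assuming each job's running time has already been estimated from its profile and that independent jobs can run fully in parallel:

```python
from collections import defaultdict

def critical_path_time(jobs, edges):
    """jobs: {job_id: estimated running time (s)}; edges: (upstream, downstream) pairs.
    Returns the critical-path length through the DAG, i.e. an estimate of the
    DAG completion time when independent jobs run fully in parallel."""
    succ = defaultdict(list)
    indeg = {j: 0 for j in jobs}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    earliest = {j: 0.0 for j in jobs}   # earliest possible start of each job
    ready = [j for j in jobs if indeg[j] == 0]
    finish = 0.0
    while ready:
        j = ready.pop()
        done = earliest[j] + jobs[j]
        finish = max(finish, done)
        for c in succ[j]:
            earliest[c] = max(earliest[c], done)
            indeg[c] -= 1
            if indeg[c] == 0:
                ready.append(c)
    return finish

# Hypothetical example: a Pig script compiled into three jobs, where J3 joins
# the outputs of J1 and J2. The estimate is max(120, 90) + 60 = 180 seconds.
jobs = {"J1": 120.0, "J2": 90.0, "J3": 60.0}
edges = [("J1", "J3"), ("J2", "J3")]
print(critical_path_time(jobs, edges))   # 180.0
```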

Cited by 33 publications (21 citation statements). References 17 publications.
“…It then finds the minimum number of slots that are required to meet a running time constraint using Lagrange multipliers. Extensions have appeared in [18], where trade-off curves between running time and monetary cost are provided to the user, who makes the final choice, and in [19], where the number of map and reduce slots are optimally decided. All these techniques are specific to a MapReduce setting running on cloud machines and assume an analytical cost model that is even simpler than the one in [7], which is extended by our work.…”
Section: Related Work (mentioning, confidence: 99%)
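As a rough illustration of the slot-sizing idea in the statement above, the following sketch enumerates map/reduce slot allocations and returns the smallest total that still meets a deadline. The analytical cost model and all parameter names are simplifying assumptions for illustration; the cited techniques solve this analytically (e.g. with Lagrange multipliers) rather than by enumeration.

```python
def estimated_time(n_map, n_red, m_avg, r_avg, map_slots, red_slots):
    # Toy analytical cost model: map phase followed by reduce phase, each
    # phase's duration scaling inversely with the slots assigned to it.
    return n_map * m_avg / map_slots + n_red * r_avg / red_slots

def min_slots_for_deadline(n_map, n_red, m_avg, r_avg, deadline, max_slots=64):
    """Smallest (map_slots, reduce_slots) pair, by total slot count, whose
    estimated completion time meets the deadline; None if infeasible."""
    best = None
    for m in range(1, max_slots + 1):
        for r in range(1, max_slots + 1):
            if estimated_time(n_map, n_red, m_avg, r_avg, m, r) <= deadline:
                if best is None or m + r < sum(best):
                    best = (m, r)
    return best

# Hypothetical example: 200 map tasks averaging 30 s, 50 reduce tasks
# averaging 40 s, and a 600 s deadline.
print(min_slots_for_deadline(200, 50, 30.0, 40.0, 600.0))   # (15, 10) under this toy model
```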
“…Enforcing high-level scheduling policies and fair sharing have been explored in the context of distributed storage systems [30,31,62,67,73,74]; however, they typically consider simpler execution structures (e.g., client to server) whereas Wisp focuses on a general DAG wherein individual processes lack end-to-end visibility. Lastly, several proposals exist for optimizing job completion times for DAGs of tasks in big-data systems [11,28,77,78]. However, data analytics jobs are often orders of magnitude longer than those serviced by the SOA systems targeted by Wisp (which operate under the additional constraint of limited end-to-end visibility).…”
Section: Related Work (mentioning, confidence: 99%)
“…A MapReduce performance model relying on a compact job profile definition to calculate a lower bound, an upper bound, and an estimate of job execution time is presented. Finally, such a model, improved in [32], is validated through a simulation study and an experimental campaign on a 66-node Hadoop cluster.…”
Section: Related Work (mentioning, confidence: 99%)
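A hedged sketch of this style of bound-based model follows; the profile fields and formulas are patterned on the compact-job-profile idea described in the statement above, not a verbatim reproduction of the cited model.

```python
from dataclasses import dataclass

@dataclass
class JobProfile:
    n_map: int      # number of map tasks
    n_red: int      # number of reduce tasks
    m_avg: float    # average map task duration (s)
    m_max: float    # maximum map task duration (s)
    r_avg: float    # average reduce task duration (s)
    r_max: float    # maximum reduce task duration (s)

def completion_bounds(p, map_slots, red_slots):
    """Lower bound, upper bound, and a midpoint estimate of job completion
    time when the job runs with the given numbers of map and reduce slots."""
    low = p.n_map * p.m_avg / map_slots + p.n_red * p.r_avg / red_slots
    up = ((p.n_map - 1) * p.m_avg / map_slots + p.m_max
          + (p.n_red - 1) * p.r_avg / red_slots + p.r_max)
    return low, up, (low + up) / 2.0

# Hypothetical example: 200 maps (avg 30 s, max 50 s) and 50 reduces
# (avg 40 s, max 70 s) on 20 map slots and 10 reduce slots.
prof = JobProfile(200, 50, 30.0, 50.0, 40.0, 70.0)
print(completion_bounds(prof, 20, 10))   # (500.0, 614.5, 557.25)
```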