2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010)
DOI: 10.1109/icdew.2010.5452742

Statistics-driven workload modeling for the Cloud

Abstract: A recent trend for data-intensive computations is to use pay-as-you-go execution environments that scale transparently to the user. However, providers of such environments must tackle the challenge of configuring their system to provide maximal performance while minimizing the cost of resources used. In this paper, we use statistical models to predict resource requirements for Cloud computing applications. Such a prediction framework can guide system design and deployment decisions such as scale, sche…


Cited by 164 publications (69 citation statements)
References 6 publications
“…For example, studies on how to predict MapReduce job running times [20], [21] can evaluate their mechanisms on realistic job mixes. Studies on MapReduce energy efficiency [22], [23] can quantify energy savings under realistic workload fluctuations.…”
Section: Towards MapReduce Workload Suites
confidence: 99%
“…The tasks in a bag are usually assumed to be independent of each other [13] or to form a set of sequential tasks (possibly only one) [12]. For MapReduce applications, Yanpei Chen [14,15] offered a general MapReduce application definition in which the execution of each MapReduce job is divided into three stages: map (input), shuffle, and reduce (output); a job is specified by its input data size, its input/shuffle/output data ratios, and its data format. None of the above models for specific application types is general enough to support the multi-application-type workloads seen in practice.…”
Section: Related Work
confidence: 99%
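The job characterization summarized in the excerpt above (input data size, input/shuffle/output data ratios, and data format) can be captured in a small data structure. The following is a minimal illustrative sketch in Python; the field names and derived quantities are assumptions for exposition, not an API taken from [14,15].

```python
from dataclasses import dataclass

@dataclass
class MapReduceJobSpec:
    """Illustrative MapReduce job description: input size, per-stage ratios, format."""
    input_bytes: float    # size of the map input
    shuffle_ratio: float  # shuffle bytes / input bytes
    output_ratio: float   # output bytes / input bytes
    data_format: str      # e.g. "text" or "sequence"

    def shuffle_bytes(self) -> float:
        # Data moved between the map and reduce stages.
        return self.input_bytes * self.shuffle_ratio

    def output_bytes(self) -> float:
        # Data written by the reduce (output) stage.
        return self.input_bytes * self.output_ratio

# Example: a 10 GB job that shuffles 30% of its input and writes 5% as output.
job = MapReduceJobSpec(input_bytes=10e9, shuffle_ratio=0.3,
                       output_ratio=0.05, data_format="text")
print(job.shuffle_bytes(), job.output_bytes())
```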
“…burst or diurnal pattern), the correspondence between workload fluctuation and the variation of resource requirements. For example, [14] used the KCCA (Kernel Canonical Correlation Analysis) method to predict the execution time of MapReduce jobs, and [33] studied optimized resource-scaling options based on workload variation while ensuring the SLA.…”
Section: Related Work
confidence: 99%
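As a rough illustration of the prediction idea attributed to [14] in the excerpt above: correlate job features with performance metrics, then estimate a new job's execution time from its nearest neighbours in the correlated subspace. The cited work uses kernel CCA; the sketch below substitutes scikit-learn's linear CCA and synthetic placeholder data purely for illustration, so it is not the authors' exact method.

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.neighbors import NearestNeighbors

# Placeholder training data: X holds job features (e.g. input size, shuffle
# ratio, number of map tasks); y_perf holds measured performance metrics,
# with column 0 taken as execution time.
rng = np.random.default_rng(0)
X = rng.random((200, 5))
y_perf = np.column_stack([X @ rng.random(5), X @ rng.random(5)])
y_time = y_perf[:, 0]

# Project job features and performance metrics into a correlated subspace.
cca = CCA(n_components=2)
cca.fit(X, y_perf)
X_proj = cca.transform(X)

# Index the training jobs in the projected space for nearest-neighbour lookup.
nn = NearestNeighbors(n_neighbors=3).fit(X_proj)

def predict_runtime(x_new: np.ndarray) -> float:
    """Average the runtimes of the nearest training jobs in the projected space."""
    x_proj = cca.transform(x_new.reshape(1, -1))
    _, idx = nn.kneighbors(x_proj)
    return float(y_time[idx[0]].mean())

print(predict_runtime(rng.random(5)))
```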
“…More sophisticated resource-estimation models, including those based on machine-learning techniques such as neural networks, have been developed for workloads ranging from transaction-oriented (i.e., OLTP) to data-intensive computations [63][64][65][66]. Based on the predicted resource requirements from the observation window, the execution environment can then provision the resources for the next prediction window:…”
Section: Distributed Resource Provisioning
confidence: 99%
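A minimal sketch of the prediction step described above, assuming demand samples collected over the observation window and a simple linear-trend extrapolation as a stand-in for the more sophisticated learned models cited in [63]–[66]:

```python
import numpy as np

def predict_next_window(observed: np.ndarray, horizon: int) -> float:
    """Fit a linear trend to the observed samples and extrapolate `horizon` steps ahead."""
    t = np.arange(len(observed))
    slope, intercept = np.polyfit(t, observed, deg=1)
    return float(intercept + slope * (len(observed) - 1 + horizon))

# Example: CPU demand (in cores) sampled once a minute over a 10-minute window,
# forecast 10 minutes ahead for the next prediction window.
cpu_demand = np.array([4, 4, 5, 5, 6, 6, 6, 7, 7, 8], dtype=float)
print(predict_next_window(cpu_demand, horizon=10))
```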
“…The primary challenges in developing a workload-forecasting mechanism include [63][64][65][66][69][70][71]: (1) the potential overhead of changing the provisioned resources, since it takes time to set up resources properly before the workload can use them; (2) the ability to accurately predict future workload behavior; and (3) the ability to compute the right amount of resources required for the expected increase or decrease in workload [62]. The general framework of such a scheduling mechanism can be represented by the pseudocode below: in this mechanism, an observation window of length w is set up for the workload to collect its behavior pattern in terms of resource consumption.…”
Section: Distributed Resource Provisioning
confidence: 99%
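The pseudocode referenced in the excerpt is not reproduced in this citation statement; the following is a hedged reconstruction of the general mechanism it describes (an observation window of length w, a prediction for the next window, then provisioning). The monitor, predict, and provision hooks are assumed placeholders, not APIs from the cited papers.

```python
import time

def provisioning_loop(monitor, predict, provision, w: int = 10,
                      interval_s: float = 60.0, headroom: float = 1.2):
    """Sample resource consumption, forecast the next window, and provision for it."""
    window = []                       # observation window of length w
    while True:
        window.append(monitor())      # sample current resource consumption
        window = window[-w:]          # keep only the last w samples
        if len(window) == w:
            demand = predict(window)  # forecast demand for the next prediction window
            # Provision with headroom to absorb prediction error and the setup
            # delay noted as challenge (1) above.
            provision(demand * headroom)
        time.sleep(interval_s)        # wait until the next sampling point
```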