2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010)
DOI: 10.1109/icdew.2010.5452742

Statistics-driven workload modeling for the Cloud

Abstract: A recent trend for data-intensive computations is to use pay-as-you-go execution environments that scale transparently to the user. However, providers of such environments must tackle the challenge of configuring their system to provide maximal performance while minimizing the cost of resources used. In this paper, we use statistical models to predict resource requirements for Cloud computing applications. Such a prediction framework can guide system design and deployment decisions such as scale, sche…


Cited by 164 publications (69 citation statements)
References 6 publications
“…For example, studies on how to predict MapReduce job running times [20], [21] can evaluate their mechanisms on realistic job mixes. Studies on MapReduce energy efficiency [22], [23] can quantify energy savings under realistic workload fluctuations.…”
Section: Towards MapReduce Workload Suites
confidence: 99%
“…The tasks in a bag are usually assumed to be independent of each other [13] or to form a set of sequential tasks (possibly only one) [12]. For MapReduce applications, Yanpei Chen [14,15] offered a general MapReduce application definition in which the execution of each MapReduce job is divided into three stages: map (input), shuffle, and reduce (output); a job is specified by its input data size, its input/shuffle/output data ratios, and its data format. None of the above models for specific application types is general enough to support the multi-application-type workloads seen in practice.…”
Section: Related Work
confidence: 99%
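The job characterization summarized in the excerpt above (input data size, input/shuffle/output data ratios, and data format) can be captured in a small data structure. The following is a minimal illustrative sketch in Python; the field names and derived quantities are assumptions for exposition, not an API taken from [14,15].

```python
from dataclasses import dataclass

@dataclass
class MapReduceJobSpec:
    """Illustrative MapReduce job description: input size, per-stage ratios, format."""
    input_bytes: float    # size of the map input
    shuffle_ratio: float  # shuffle bytes / input bytes
    output_ratio: float   # output bytes / input bytes
    data_format: str      # e.g. "text" or "sequence"

    def shuffle_bytes(self) -> float:
        # Data moved between the map and reduce stages.
        return self.input_bytes * self.shuffle_ratio

    def output_bytes(self) -> float:
        # Data written by the reduce (output) stage.
        return self.input_bytes * self.output_ratio

# Example: a 10 GB job that shuffles 30% of its input and writes 5% as output.
job = MapReduceJobSpec(input_bytes=10e9, shuffle_ratio=0.3,
                       output_ratio=0.05, data_format="text")
print(job.shuffle_bytes(), job.output_bytes())
```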
“…burst or diurnal pattern), the correspondence between workload fluctuation and the variation of resource requirements. For example, [14] used the KCCA (Kernel Canonical Correlation Analysis) method to predict the execution time of MapReduce jobs, and [33] studied optimized resource-scaling options based on workload variation while ensuring the SLA.…”
Section: Related Work
confidence: 99%
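As a rough illustration of the prediction idea attributed to [14] in the excerpt above: correlate job features with performance metrics, then estimate a new job's execution time from its nearest neighbours in the correlated subspace. The cited work uses kernel CCA; the sketch below substitutes scikit-learn's linear CCA and synthetic placeholder data purely for illustration, so it is not the authors' exact method.

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.neighbors import NearestNeighbors

# Placeholder training data: X holds job features (e.g. input size, shuffle
# ratio, number of map tasks); y_perf holds measured performance metrics,
# with column 0 taken as execution time.
rng = np.random.default_rng(0)
X = rng.random((200, 5))
y_perf = np.column_stack([X @ rng.random(5), X @ rng.random(5)])
y_time = y_perf[:, 0]

# Project job features and performance metrics into a correlated subspace.
cca = CCA(n_components=2)
cca.fit(X, y_perf)
X_proj = cca.transform(X)

# Index the training jobs in the projected space for nearest-neighbour lookup.
nn = NearestNeighbors(n_neighbors=3).fit(X_proj)

def predict_runtime(x_new: np.ndarray) -> float:
    """Average the runtimes of the nearest training jobs in the projected space."""
    x_proj = cca.transform(x_new.reshape(1, -1))
    _, idx = nn.kneighbors(x_proj)
    return float(y_time[idx[0]].mean())

print(predict_runtime(rng.random(5)))
```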
“…More sophisticated resource-estimation models, including those based on machine-learning techniques such as neural networks, have been developed for workloads ranging from transaction-oriented (i.e., OLTP) to data-intensive computations [63][64][65][66]. Based on the predicted resource requirements from the observation window, the execution environment can then provision the resources for the next prediction window:…”
Section: Distributed Resource Provisioning
confidence: 99%
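A minimal sketch of the prediction step described above, assuming demand samples collected over the observation window and a simple linear-trend extrapolation as a stand-in for the more sophisticated learned models cited in [63]–[66]:

```python
import numpy as np

def predict_next_window(observed: np.ndarray, horizon: int) -> float:
    """Fit a linear trend to the observed samples and extrapolate `horizon` steps ahead."""
    t = np.arange(len(observed))
    slope, intercept = np.polyfit(t, observed, deg=1)
    return float(intercept + slope * (len(observed) - 1 + horizon))

# Example: CPU demand (in cores) sampled once a minute over a 10-minute window,
# forecast 10 minutes ahead for the next prediction window.
cpu_demand = np.array([4, 4, 5, 5, 6, 6, 6, 7, 7, 8], dtype=float)
print(predict_next_window(cpu_demand, horizon=10))
```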
“…The primary challenges in developing a workload-forecasting mechanism include [63][64][65][66][69][70][71]: (1) the potential overhead of changing the provisioned resources, since it takes time to set up resources properly before the workload can use them; (2) the ability to accurately predict future workload behavior; and (3) the ability to compute the right amount of resources required for the expected increase or decrease in workload [62]. The general framework of such a scheduling mechanism can be represented by the pseudocode below: in this mechanism, an observation window of length w is set up for the workload to collect its behavior pattern in terms of resource consumption.…”
Section: Distributed Resource Provisioning
confidence: 99%
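The pseudocode referenced in the excerpt is not reproduced in this citation statement; the following is a hedged reconstruction of the general mechanism it describes (an observation window of length w, a prediction for the next window, then provisioning). The monitor, predict, and provision hooks are assumed placeholders, not APIs from the cited papers.

```python
import time

def provisioning_loop(monitor, predict, provision, w: int = 10,
                      interval_s: float = 60.0, headroom: float = 1.2):
    """Sample resource consumption, forecast the next window, and provision for it."""
    window = []                       # observation window of length w
    while True:
        window.append(monitor())      # sample current resource consumption
        window = window[-w:]          # keep only the last w samples
        if len(window) == w:
            demand = predict(window)  # forecast demand for the next prediction window
            # Provision with headroom to absorb prediction error and the setup
            # delay noted as challenge (1) above.
            provision(demand * headroom)
        time.sleep(interval_s)        # wait until the next sampling point
```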