2015
DOI: 10.1145/2788402.2788410
Optimal Map Reduce Job Capacity Allocation in Cloud Systems

Abstract: We are entering a Big Data world. Many sectors of our economy are now guided by data-driven decision processes. Big Data and business intelligence applications are facilitated by the MapReduce programming model, while, at the infrastructural layer, cloud computing provides flexible and cost-effective solutions for allocating large clusters on demand. Capacity allocation in such systems is a key challenge for providing performance guarantees to MapReduce jobs while minimizing cloud resource costs. The contribution of this paper is t…
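As an orientation for the problem the abstract describes, the following is a minimal sketch of a deadline-constrained capacity-allocation program, assuming a linear pricing model and an execution-time estimate T_i(c_i); the symbols and cost structure are illustrative assumptions, not the formulation proposed in the paper.

\begin{aligned}
\min_{c_1,\dots,c_N} \quad & \sum_{i=1}^{N} p \, c_i \\
\text{s.t.} \quad & T_i(c_i) \le D_i, \qquad i = 1,\dots,N, \\
& c_i \ge 0,
\end{aligned}

where c_i is the capacity (e.g., map/reduce slots or VMs) allocated to job class i, p is the unit capacity price, T_i(c_i) is an estimate or upper bound of that class's execution time under capacity c_i, and D_i is its deadline.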

Cited by 20 publications (23 citation statements)
References 24 publications
“…Bardhan and Menascé [51] applied QN models for predicting the completion time of the map phase of MR jobs. Upper and lower bounds were analytically derived for MR job execution time in shared Hadoop clusters by the authors of [52]. SPNs have been used in [12] for performance prediction of adaptive Big Data architectures. Mean Field Analysis was applied by the authors of [12] to obtain average performance metrics.…”
Section: Related Work
confidence: 99%
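The bound-based analysis referenced in the statement above can be illustrated with a short sketch. The two formulas below are the standard greedy-scheduling makespan bounds for n independent tasks executed on k identical slots; they are shown only to convey the style of reasoning and are not claimed to be the exact bounds derived in [52].

# Illustrative only: classic greedy-scheduling makespan bounds for one
# MapReduce phase (n independent tasks on k identical slots). These are
# not claimed to be the exact bounds derived in [52].

def phase_lower_bound(n_tasks, avg_task_s, slots):
    # No schedule can finish before the total work is spread evenly over all slots.
    return n_tasks * avg_task_s / slots

def phase_upper_bound(n_tasks, avg_task_s, max_task_s, slots):
    # Greedy (online) task assignment finishes within this time in the worst case.
    return (n_tasks - 1) * avg_task_s / slots + max_task_s

if __name__ == "__main__":
    n, avg_s, max_s, k = 400, 30.0, 55.0, 20
    print(f"map phase on {k} slots: "
          f"between {phase_lower_bound(n, avg_s, k):.0f}s "
          f"and {phase_upper_bound(n, avg_s, max_s, k):.0f}s")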
“…The goal of such a framework is to optimize cluster resource allocation, with the aim of increasing cluster utilization and reducing costs. A classical QoS performance model has been adopted to predict the response times of WS requests, whereas approximate formulae (proposed in [15]) provide estimates of the execution times of MR jobs so that the latter are bounded by given deadlines. The joint AC&CA problem is formulated as a mathematical model whose objective is to minimize the operational costs of running the cluster and the penalty costs incurred from request rejections.…”
Section: MR Applications Have Evolved From Batch Analysis On Dedicated…
confidence: 99%
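To make the cost-plus-penalty structure of that joint AC&CA formulation concrete, here is a hedged sketch; compared with the earlier deadline-constrained program, the new ingredients are binary admission variables and rejection penalties. The symbols and the linear cost terms are assumptions for illustration, not the cited model.

\begin{aligned}
\min_{x,\,c} \quad & \sum_{i} p \, c_i \;+\; \sum_{i} \pi_i \, (1 - x_i) \\
\text{s.t.} \quad & T_i(c_i) \le D_i \quad \text{whenever } x_i = 1 \text{ (admitted MR jobs)}, \\
& \text{response-time targets hold for admitted WS requests}, \\
& x_i \in \{0,1\}, \qquad c_i \ge 0,
\end{aligned}

where x_i decides whether request class i is admitted, c_i is the capacity assigned to it, p is the unit capacity price, and \pi_i is the penalty incurred when class i is rejected.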
“…We consider interactive MapReduce applications, which are characterized by given deadlines D_i. As in [14,15], we denote with…”
Section: Allocation Of MapReduce And Web Service Applications
confidence: 99%
“…At the infrastructural layer, resource contention leads to unpredictable performance [4] and additional work for resource management [5]; automated VM and service migration [6] is still needed. Networks are also frequently the Cloud bottleneck, and data-center energy management is critical [7]. To cope with these challenges, many researchers have advocated the adoption of multi-Clouds [8], since deploying software on multiple Clouds overcomes single-provider unavailability and allows cost-efficient follow-the-sun applications to be built. Moreover, Cloud computing is also becoming a mainstream solution for providing very large clusters on a pay-per-use basis to support Big Data applications [9]. Many cloud providers already include MapReduce-based platforms in their offerings (MapReduce being one of the most widely adopted frameworks for processing large volumes of unstructured information), such as the Google MapReduce framework, Microsoft HDinsight, and Amazon Elastic Compute Cloud.…”
Section: Extended Abstract
confidence: 99%