Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering 2013
DOI: 10.1145/2479871.2479906
Benchmarking approach for designing a MapReduce performance model

Abstract: In MapReduce environments, many programs are reused for processing regularly arriving new data. A typical user question is how to estimate the completion time of these programs as a function of a new dataset and the cluster resources. In this work, we offer a novel performance evaluation framework for answering this question. We observe that the execution of each map (reduce) task consists of specific, well-defined data processing phases. Only the map and reduce functions are custom, and their execution…
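The abstract's idea of estimating completion time from per-task measurements can be illustrated with the classic makespan bounds for scheduling n tasks on k slots. This is a minimal sketch, not the authors' actual model; the task durations and slot count below are hypothetical benchmark values.

```python
# Illustrative sketch (assumption: not the paper's exact framework):
# bound the completion time of one MapReduce stage (map or reduce)
# from measured per-task durations, using the standard makespan bounds
# for n tasks on k slots:
#   lower = (n / k) * avg_duration
#   upper = ((n - 1) / k) * avg_duration + max_duration

def stage_completion_bounds(task_durations, num_slots):
    """Return (lower, upper) completion-time bounds for one stage."""
    n = len(task_durations)
    avg = sum(task_durations) / n
    mx = max(task_durations)
    lower = (n / num_slots) * avg
    upper = ((n - 1) / num_slots) * avg + mx
    return lower, upper

# Example: 8 map tasks (hypothetical durations, seconds) on 4 map slots.
lower, upper = stage_completion_bounds([10, 12, 11, 9, 10, 13, 12, 11], 4)
```

For a new dataset, one would scale the measured per-task durations by the new input size before applying the bounds; that scaling step is where the paper's phase profiling comes in.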

Cited by 41 publications (24 citation statements); references 7 publications.
“…Counters help profile the job performance and provide important information for designing new schedulers. We utilize the extended set of counters from [6] in DyScale.…”
Section: MapReduce Background (mentioning)
confidence: 99%
“…According to Zhang et al. [3], "MapReduce and Hadoop represent an economically compelling alternative for efficient large scale data processing and cost-effective analytics over 'Big Data' in the enterprise". Details on these two distributed data processing components will be discussed in the next subsections.…”
Section: Data Storage and Processing (mentioning)
confidence: 99%
“…These workloads continuously evolve as the user base changes, as features are activated or disabled and as user feature preferences change. Such varying field workloads often lead to load tests that are not reflective of the field [9,46], yet these workloads have a major impact on the performance of the system [15,49].…”
Section: Introduction (mentioning)
confidence: 99%
“…Performance analysts must determine the cause of any deviation in the counter values from the specified or expected range (e.g., response time exceeds the maximum response time permitted by the service level agreements or memory usage exceeds the average historical memory usage). These deviations may be caused by changes to the field workloads [15,49]. Such changes are common and may require performance analysts to update their load test cases [9,46].…”
Section: Introduction (mentioning)
confidence: 99%