2016
DOI: 10.1007/978-3-319-39577-7_11

Self-Balancing Job Parallelism and Throughput in Hadoop

Abstract: In Hadoop cluster, the performance and the resource consumption of MapReduce jobs do not only depend on the characteristics of these applications and workloads, but also on the appropriate setting of Hadoop configuration parameters. However, when the job workloads are not known a priori or they evolve over time, a static configuration may quickly lead to a waste of computing resources and consequently to a performance degradation. In this paper, we therefore propose an on-line approach that dynamically reconfi…

Citations: cited by 3 publications (5 citation statements)
References: 23 publications
“…This is one of the crucial parameter whose inappropriate configuration can have a serious impact on Hadoop performance (up to 40%, cf. Zhang et al [11]).…”
Section: Case Study
Citation type: mentioning
confidence: 99%
“…For this task, we rely on docker-machine 10 , a tool that creates virtual machines and install Docker engine on them. It currently supports creating virtual machines in local machine using either VirtualBox or VMWare, in local cluster, reusing existing machines simply via SSH, or in number of cloud providers 11 including all the major vendors. The main advantage of using Docker Machine is in the layer of abstraction it provides that can form the very same Hadoop cluster regardless the actual virtualization and cloud environment being used.…”
Section: Hadoop Benchmark Platform
Citation type: mentioning
confidence: 99%
“…They improved Hadoop's name node performance and proposed an optimization framework. Zhang et al [13] dynamically reconfigured Hadoop for realtime performance, outperforming the vanilla Hadoop. Lo and Cheng [14] proposed a modified fair scheduler that dynamically adjusts the resource allocation for user jobs and reduces the average turnaround time.…”
Section: Significance Of Study
Citation type: mentioning
confidence: 99%