2014
DOI: 10.5120/ijais14-451261

Comparative Study Load Balance Algorithms for Map Reduce Environment

Abstract: MapReduce is a well-known model for data-intensive parallel computing in shared-nothing clusters. One of its main issues is that its performance depends mainly on data distribution. MapReduce includes a simple load-balancing technique based on a FIFO job scheduler that serves jobs in their submission order. Unfortunately, this is insufficient in real-world cases because it misses many factors that affect performance, such as heterogeneity and data skew, so load balancing is important t…

citations
Cited by 7 publications
(2 citation statements)
references
References 22 publications
“…Partitioning for Map-Reduce Partitioning has been widely studied for Map-Reduce-based processing [7,22,23,25]. While conceptually similar, these approaches either require offline preprocessing of the data and, thus, are not suitable with optimize solely for the map or the reduce phase.…”
Section: Related Work (mentioning)
confidence: 99%
“…The next MapReduce job reads the intermediate results of the previous job to continue processing. The HDFS I/O cost is significantly higher than local storage (i.e., there is Network cost) that use load balance [22,23]. So, exploiting shared jobs can reduce intermediate results, and can be cheaper than generating too large size of intermediate results in the case of using an original data source for each query separately [24,25].…”
Section: MapReduce Query Processing (mentioning)
confidence: 99%