2012
DOI: 10.1007/978-1-4614-2326-3_20

The Partition Cost Model for Load Balancing in MapReduce

Cited by 8 publications (5 citation statements)
References 13 publications
“…Most existing works [2][3][4][20,21] only target the partitioning skew and neglect the computational skew that can arise in both the map and reduce stages. Moreover, a common approach is adopted to these solutions that predicts and then redistributes the task load to achieve a better balance, which requires additional (sometimes heavy) overhead in terms of key distribution sampling and load reassignment.…”
Section: Resource Management In Hadoop Yarn
Citation type: mentioning
confidence: 99%
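
The "predict, then redistribute" pattern this statement describes can be illustrated with a minimal Python sketch. It is not taken from the citing paper or from the works it references: the function names `estimate_key_load` and `assign_keys`, the uniform sampling, and the greedy largest-first placement are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def estimate_key_load(records, sample_rate, seed=0):
    """Estimate per-key record counts from a uniform random sample.

    Hypothetical helper: `records` is an iterable of (key, value) pairs.
    """
    rng = random.Random(seed)
    sample = Counter(key for key, _ in records if rng.random() < sample_rate)
    # Scale the sampled counts back up to approximate the full distribution.
    return {key: count / sample_rate for key, count in sample.items()}


def assign_keys(key_load, num_reducers):
    """Greedy reassignment: heaviest remaining key goes to the least-loaded reducer."""
    loads = [0.0] * num_reducers
    assignment = {}
    # Sort by descending load; ties are broken by key so the result is deterministic.
    for key, load in sorted(key_load.items(), key=lambda kv: (-kv[1], kv[0])):
        target = min(range(num_reducers), key=lambda r: loads[r])
        assignment[key] = target
        loads[target] += load
    return assignment, loads


if __name__ == "__main__":
    records = [("a", 1)] * 10 + [("b", 1)] * 6 + [("c", 1)] * 5 + [("d", 1)]
    estimated = estimate_key_load(records, sample_rate=1.0)
    print(assign_keys(estimated, num_reducers=2))
```

Both steps are extra work on top of the job itself: the sampling pass reads part of the input before reducers can be planned, and the assignment table must be computed and shipped to the tasks. In this sketch, keys that never appear in the sample receive no assignment and would need a fallback such as hashing.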
“…The CPU resources allocated to a task are determined by the number of vCores allocated to the task. Memory allocation, by contrast, is controlled by two configurations: Logical RAM limit and maximum JVM heap size limit. The former is a unit used to manage the resources logically, while the latter setting reflects the maximum heap size of the JVM that runs the task.…”
Section: Impact Of Resources On Task Running Time
Citation type: mentioning
confidence: 99%
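
The relationship between the two memory settings can be checked with a small Python sketch. The statement does not name the configuration keys; mapping them to the Hadoop 2.x properties `mapreduce.map.memory.mb` (logical limit) and `mapreduce.map.java.opts` (JVM heap via `-Xmx`) is an assumption made here, and the helper names and the values in the example are purely illustrative.

```python
import re


def heap_mb_from_java_opts(java_opts):
    """Return the -Xmx value in MB, or None if no -Xmx with a k/m/g suffix is present."""
    match = re.search(r"-Xmx(\d+)([kKmMgG])", java_opts)
    if match is None:
        return None
    value = int(match.group(1))
    unit = match.group(2).lower()
    if unit == "k":
        return value / 1024
    if unit == "g":
        return value * 1024.0
    return float(value)


def heap_fits_container(conf, prefix="mapreduce.map"):
    """Check that the JVM heap does not exceed the logical RAM limit of the task."""
    heap_mb = heap_mb_from_java_opts(conf.get(prefix + ".java.opts", ""))
    limit_mb = int(conf[prefix + ".memory.mb"])
    return heap_mb is not None and heap_mb <= limit_mb


if __name__ == "__main__":
    # Illustrative values only, not recommendations.
    conf = {
        "mapreduce.map.cpu.vcores": "1",
        "mapreduce.map.memory.mb": "2048",
        "mapreduce.map.java.opts": "-Xmx1638m",
    }
    print(heap_fits_container(conf))
```

A JVM also uses memory outside the heap, so in practice the heap is usually set somewhat below the logical limit rather than equal to it.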
“…Skew in MapReduce. The first related solutions mitigate reducer skew by measuring the key distribution during the Map operation [33,13]. In general, these methods are not appropriate for long-running streaming tasks with concept drifts in key distribution, nor for stateful operators that require state migration after repartitioning.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…[Figure 2: Flow chart of short reads gene sequence parallel alignment] Partitioner is a means of data distribution provided by Hadoop platform [10]. The partition classes built in the platform, such as HashPartitioner and BinaryPartitioner, are not suitable for the distribution of pair-end sequences.…”
Section: Algorithm 1 the Map Algorithm
Citation type: mentioning
confidence: 99%
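
For context on what a hash-based partitioner does, the following Python sketch mimics the rule used by Hadoop's `HashPartitioner`, which computes `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`. The string hash below emulates Java's `String.hashCode()` as a simplifying assumption; real Hadoop key types define their own `hashCode`, so actual partition numbers will differ. The function names are hypothetical.

```python
def java_string_hash(s):
    """Emulate Java's String.hashCode() as a signed 32-bit value.

    Assumption: characters are in the Basic Multilingual Plane, so each
    Python character corresponds to one Java UTF-16 code unit.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit result as a signed Java int.
    return h - (1 << 32) if h >= (1 << 31) else h


def hash_partition(key, num_reducers):
    """Mask off the sign bit, then take the remainder, as HashPartitioner does."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


def partition_sizes(keys, num_reducers):
    """Count how many records land on each reducer."""
    sizes = [0] * num_reducers
    for key in keys:
        sizes[hash_partition(key, num_reducers)] += 1
    return sizes


if __name__ == "__main__":
    # One frequent key sends all of its records to a single reducer.
    print(partition_sizes(["ab"] * 5 + ["b"], num_reducers=4))
```

Because the partition depends only on the key's hash, records that belong together but carry different keys are not guaranteed to reach the same reducer, and a frequent key concentrates its records on one reducer.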