2012 IEEE 28th International Conference on Data Engineering 2012
DOI: 10.1109/icde.2012.58

Load Balancing in MapReduce Based on Scalable Cardinality Estimates

citations
Cited by 96 publications
(47 citation statements)
references
References 13 publications
“…Then, the remaining keys (keygroups) of running tasks are tried to redistribute so that the capacity of the idle nodes is utilized. The approach in [5] is similar to our previous load balancing work [12] as it also relies on cardinality estimates determined during the map phase of the computation.…”
Section: Related Work (mentioning)
confidence: 99%
“…In this work we use OpenCL, a vendor-agnostic industry standard. The memory model as exposed to OpenCL kernels is depicted in Figure 2: An instance of a compute kernel running on a device is called a work item or simply thread. Work items are combined into work groups.…”
Section: General-purpose Computing on GPUs (mentioning)
confidence: 99%
“…Then, the remaining keys (keygroups) of running tasks are tried to redistribute so that the capacity of the idle nodes is utilized. The approach in [7] is similar to our previous load balancing work [13] as it also relies on cardinality estimates determined during the map phase of the computation. This study as well as SkewTune are not focusing on entity resolution and cannot handle skew problems introduced by dominating blocks or key groups that need to be distributed among several reduce tasks.…”
Section: Related Work (mentioning)
confidence: 99%
“…Load balancing and skew handling are well-known problems for parallel data processing but have only recently gained attention for MapReduce [21,18,19,7]. [21] presents a theoretical analysis of skew effects for MR but focuses on linear processing of entities in the reduce phase while ER has quadratic complexity to compare entities with each other.…”
Section: Related Work (mentioning)
confidence: 99%
“…Gufler et al [48] study the problem of handling data skew by means of an adaptive load balancing strategy. A cost estimation method is proposed to quantify the cost of the work assigned to reduce tasks, in order to ensure that this is performed fairly.…”
Section: Repartitioning (mentioning)
confidence: 99%
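
The citation statements above describe the general idea: estimate the cost of each partition, then use those estimates to spread work fairly across reduce tasks. The sketch below illustrates that idea only. It is not the algorithm from the cited paper. It assumes per-partition cardinality estimates are already available, and it swaps in a plain greedy "largest cost first" heuristic for the assignment step. The names `assign_partitions`, `estimated_sizes`, and `complexity` are hypothetical and chosen for illustration.

```python
import heapq


def assign_partitions(estimated_sizes, num_reducers, complexity=lambda n: n):
    """Assign partitions to reducers using estimated costs (illustrative sketch).

    estimated_sizes: dict mapping a partition id to its estimated tuple count.
    complexity: hypothetical cost model of the reducer, e.g. n for linear
                processing or n * n for pairwise comparison.
    Returns (assignment, loads), both keyed by reducer index.
    """
    # Turn each cardinality estimate into an estimated processing cost.
    costs = {p: complexity(n) for p, n in estimated_sizes.items()}

    # Min-heap of (current load, reducer index): the least loaded reducer is on top.
    heap = [(0, r) for r in range(num_reducers)]
    heapq.heapify(heap)
    assignment = {r: [] for r in range(num_reducers)}

    # Greedy heuristic: place the most expensive partitions first.
    for p in sorted(costs, key=lambda k: (-costs[k], k)):
        load, r = heapq.heappop(heap)
        assignment[r].append(p)
        heapq.heappush(heap, (load + costs[p], r))

    loads = {r: sum(costs[p] for p in ps) for r, ps in assignment.items()}
    return assignment, loads


if __name__ == "__main__":
    sizes = {"a": 8, "b": 5, "c": 4, "d": 3}
    print(assign_partitions(sizes, 2))                    # linear cost model
    print(assign_partitions(sizes, 2, lambda n: n * n))   # quadratic cost model
```

With a quadratic cost model, the largest partition dominates the estimated load far more than under a linear one. This matches the point in the entity-resolution statements that a single dominating key group cannot be balanced without being split across several reduce tasks.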