Cloud Computing for Data-Intensive Applications 2014
DOI: 10.1007/978-1-4939-1905-5_12

Cross-Phase Optimization in MapReduce

Abstract: MapReduce has been designed to accommodate large-scale data-intensive workloads running on large single-site homogeneous clusters. Researchers have begun to explore the extent to which the original MapReduce assumptions can be relaxed, including skewed workloads, iterative applications, and heterogeneous computing environments. Our work continues this exploration by applying MapReduce across widely distributed data over distributed computation resources. This problem arises when datasets are generated a…

Cited by 10 publications (9 citation statements)
References 20 publications
“…Hierarchical MapReduce [14] is concerned with compute-intensive MapReduce applications and how to apply multiple distributed clusters to them, but uses clusters and not edge resources. Our recent work [9] is focused more on cross-phase MapReduce optimization, albeit in a wide-area setting. Estimating network paths and forecasting future network conditions are addressed by projects such as NWS [20].…”
Section: Related Work
mentioning
confidence: 99%
“…There are a number of relevant distributed MapReduce projects in the literature [12], [14], [9]. Moon [12] is focused on voluntary resources but not in a wide-area setting.…”
Section: Related Work
mentioning
confidence: 99%
“…In [14] the authors focus on the problem of how map and reduce processes are split over hybrid cloud platforms. Heintz et al. [15] show strategies for scheduling map tasks over distributed cloud platforms in order to minimize the performance degradation that can result from large communication times between the platforms. Our own previous work introduces a proposal for data migration (asynchronous rebalancing of the MapReduce storage layer) and scheduling strategies (enforced locality to avoid data pulls over the weak link) that are specifically designed to address the weak link bottleneck in hybrid cloud bursting [2], [16].…”
Section: Related Work
mentioning
confidence: 99%
“…TIME FOR THE ORIGINAL (3) AND EXTENDED (15) ON-PREMISE ONLY BASELINE SCENARIOS. (1) No HDFS Off-Premise: the MapReduce runtime makes use of the off-premise VMs but HDFS is deployed only on-premise; (2) Blocking Rebalance: HDFS is deployed on the off-premise VMs, the data blocks are rebalanced and then, after this process finished, the MapReduce application is executed; (3) Plain Asynchronous Rebalance: HDFS is deployed on the off-premise VMs and the rebalancing starts in the background at the same time as the application with lo-…”
mentioning
confidence: 99%
“…This has led to the emergence of a number of distributed computing frameworks such as Hadoop [5], Dryad [6], Pregel [7], and others [8], [9]. Although these computing frameworks were originally designed for processing applications in a cluster environment, researchers have also looked at adapting and optimizing them in a geo-distributed environment [10]-[12]. We believe that the growth of geo-distributed data and applications that operate on widely distributed data will trigger more computing frameworks to be developed or adapted for a geo-distributed system.…”
Section: Introduction
mentioning
confidence: 99%