2010
DOI: 10.1145/1740390.1740405

On the energy (in)efficiency of Hadoop clusters

Abstract: Distributed processing frameworks, such as Yahoo!'s Hadoop and Google's MapReduce, have been successful at harnessing expansive datacenter resources for large-scale data analysis. However, their effect on datacenter energy efficiency has not been scrutinized. Moreover, the filesystem component of these frameworks effectively precludes scale-down of clusters deploying these frameworks (i.e. operating at reduced capacity). This paper presents our early work on modifying Hadoop to allow scale-down of operational …

Cited by 273 publications (168 citation statements)
References 6 publications

“…For example, studies on how to predict MapReduce job running times [20], [21] can evaluate their mechanisms on realistic job mixes. Studies on MapReduce energy efficiency [22], [23] can quantify energy savings under realistic workload fluctuations. Various efforts to develop effective MapReduce workload management schemes [24], [7] can generalize their findings across a different realistic workloads.…”
Section: Towards Mapreduce Workload Suites (mentioning)
confidence: 99%
“…Each possible solution has its own business tradeoffs. There is significant research work in progress in industry and academia to address this problem, but many challenges still remain [3,18,20,26].…”
Section: Other Issues (mentioning)
confidence: 99%
“…The initial work on MapReduce cluster energy management was presented in [17] based on covering subset (CS). In that work, the CS nodes are manually determined, and one replica for each data item is then placed in one of the CS nodes.…”
Section: Mapreduce Cluster Energy Management (mentioning)
confidence: 99%
“…definition in [17]. The CS used here is not a static node set, rather it is discovered on demand based on a given list of data blocks required for computation.…”
Section: Node Set Discovery Algorithms (mentioning)
confidence: 99%
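
The last two statements contrast a manually chosen covering subset (CS) with one discovered on demand from the list of data blocks a computation needs. As a rough illustration only, the sketch below picks a node set that covers a given block list using a greedy set-cover heuristic. The greedy rule, the function name `discover_covering_nodes`, and the `replicas` mapping are assumptions made for this example; none of them is taken from the cited papers, whose actual discovery algorithms are not shown on this page.

```python
from typing import Dict, Iterable, Set


def discover_covering_nodes(
    required_blocks: Iterable[str],
    replicas: Dict[str, Set[str]],
) -> Set[str]:
    """Pick nodes until every required block has a replica on a chosen node.

    `replicas` is a hypothetical map from node name to the block IDs it stores.
    This is a greedy heuristic, not the algorithm of any cited paper.
    """
    uncovered = set(required_blocks)
    chosen: Set[str] = set()
    while uncovered:
        # Take the node holding the most still-uncovered blocks.
        # Iterating in sorted order makes ties break by node name.
        best = max(
            sorted(replicas),
            key=lambda node: len(replicas[node] & uncovered),
            default=None,
        )
        gained = replicas[best] & uncovered if best is not None else set()
        if not gained:
            # No remaining node stores any of the missing blocks.
            raise ValueError(f"no replica available for blocks: {sorted(uncovered)}")
        chosen.add(best)
        uncovered -= gained
    return chosen


if __name__ == "__main__":
    example_replicas = {
        "n1": {"b1", "b2"},
        "n2": {"b2", "b3"},
        "n3": {"b3"},
    }
    # Nodes outside the returned set are candidates for powering down.
    print(sorted(discover_covering_nodes(["b1", "b2", "b3"], example_replicas)))
```

In this toy setup, any node outside the returned set holds no block that the job needs exclusively, which is the property a scale-down scheme would rely on before putting nodes into a low-power state.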