2014
DOI: 10.1007/978-3-642-53974-9_12
A Micro-benchmark Suite for Evaluating HDFS Operations on Modern Clusters

Cited by 13 publications (10 citation statements)
References 19 publications
“…For all tested data sizes, writing is at least 2.2 times slower than reading, both in time and in throughput (column CDH Read/Write Δ), and the difference grows further as the data size increases. Similar behavior is observed in the results presented by Nicholas Wakou [25] (around 2.5 times slower writing) and Islam et al. [26].…”
Section: B. Enhanced DFSIO (supporting)
confidence: 91%
“…To process data sizes of 240 GB, 340 GB, and 440 GB, the file size parameter was fixed at 400 MB and the number of files to read and write was set to 615, 871, and 1127, respectively. The file size and number of files were chosen based on results presented in related work [25], [26].…”
Section: B. Enhanced DFSIO (mentioning)
confidence: 99%
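The file counts quoted above follow from dividing each target data size by the fixed 400 MB file size and rounding up. A minimal sketch of that arithmetic, for reference (the class and method names are illustrative, not taken from the cited work):

// Illustrative only: reproduces the parameter arithmetic quoted above.
// Number of files = ceil(target data size / fixed 400 MB file size).
public class DfsioFileCount {
    static int filesFor(long targetGB, long fileSizeMB) {
        long totalMB = targetGB * 1024;                       // e.g. 240 GB -> 245760 MB
        return (int) Math.ceil(totalMB / (double) fileSizeMB);
    }

    public static void main(String[] args) {
        for (long gb : new long[] {240, 340, 440}) {
            System.out.printf("%d GB -> %d files of 400 MB%n", gb, filesFor(gb, 400));
        }
        // Prints: 240 GB -> 615, 340 GB -> 871, 440 GB -> 1127,
        // matching the parameters reported in the citation statement.
    }
}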
“…The skew in this computational load mainly originates from the characteristics of the map/reduce function and the input dataset, which in turn could affect the number of key/value pairs or records generated by both the map and the reduce tasks. In this paper, we are mainly concerned with the communication characteristics of Hadoop MapReduce workloads. Therefore, for simplicity, we assume that all processes incur a similar computational load.…”
Section: Characterization Methodology (mentioning)
confidence: 99%
“…In addition to the above benchmarks that address the Hadoop framework as a whole, several micro-benchmark suites have been designed to study individual components of the Apache Hadoop framework, such as Hadoop RPC [34] and the Hadoop Distributed File System (HDFS) [24], and particularly Hadoop MapReduce [49], an extended version of which is presented in this paper.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…In this section, we have evaluated our design using the Sequential Write Latency (SWL) HDFS micro-benchmark [2]. Figure 4(a) shows the performance of our design on Cluster A.…”
Section: Evaluation Using HDFS Microbenchmark (mentioning)
confidence: 99%
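For context, a sequential-write-latency measurement of this kind can be sketched against the standard Hadoop FileSystem API as below. The output path, 1 GiB file size, and 1 MiB write unit are assumptions for illustration, not the parameters of the cited benchmark [2].

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SequentialWriteLatency {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();         // reads core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/benchmarks/swl-test.dat"); // hypothetical output path
        byte[] buf = new byte[1 << 20];                   // 1 MiB write unit (assumption)
        int totalMiB = 1024;                              // 1 GiB file (assumption)

        long start = System.nanoTime();
        try (FSDataOutputStream out = fs.create(path, true)) {
            for (int i = 0; i < totalMiB; i++) {
                out.write(buf);                           // sequential writes through the HDFS pipeline
            }
            out.hsync();                                  // flush to DataNodes before stopping the clock
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("Wrote %d MiB in %.2f s (%.1f MiB/s)%n",
                totalMiB, seconds, totalMiB / seconds);
        fs.delete(path, false);                           // clean up the test file
    }
}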