2014
DOI: 10.1016/j.jss.2014.02.038
Performance models and dynamic characteristics analysis for HDFS write and read operations: A systematic view

Cited by 18 publications (14 citation statements)
References 18 publications
“…To study the relationship between file size and HDFS W/R performance, a file set containing 78 files of different sizes is generated. Based on prior knowledge of HDFS [2], [4], its W/R performance changes quickly and sharply for small files and becomes relatively stable as file size increases. Therefore, the sizes of the 78 files range from 0.25 MB to 320 MB.…”
Section: Experiments and Analysis
confidence: 99%
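The excerpt above describes a benchmark file set of 78 sizes spanning 0.25 MB to 320 MB. The cited paper does not state how the sizes are spaced, so the geometric spacing below is an assumption chosen to give dense coverage of the small-file region where W/R performance changes sharply; it is a sketch, not the authors' actual generator.

```python
MB = 1024 * 1024

def benchmark_sizes(n=78, lo=0.25 * MB, hi=320 * MB):
    """Return n file sizes (bytes) geometrically spaced from lo to hi.

    Geometric spacing (an assumption, not from the paper) places many
    points at small sizes, where HDFS W/R performance varies most.
    """
    ratio = (hi / lo) ** (1 / (n - 1))
    return [round(lo * ratio ** i) for i in range(n)]

sizes = benchmark_sizes()
# sizes[0] is 0.25 MB, sizes[-1] is 320 MB, 78 sizes in total
```

Each size would then be materialized as a file (e.g., filled with random bytes) and written to and read from the cluster under test.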
“…Thirdly, analytical modeling typically depends on simplifying assumptions about system or workload behavior, so the accuracy of analytical models may be seriously challenged in scenarios where such assumptions no longer hold [17]. The other, measurement-based methodology (e.g., [4], [5], [18], [19]), which experiments on the system by running benchmarks, application programs, or specially designed data sets, is a promising way to deal with randomness, since it requires no expert knowledge of the dynamic behavior of the system or its workloads. Evidently, the measurement-based methodology relies on observed system behavior, which is always subject to various kinds of uncertainty due to external disturbance or modeling error.…”
Section: Introduction
confidence: 99%
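The measurement-based methodology described above boils down to timing write and read operations against the system under test. The following is a minimal local-filesystem sketch of such a harness; a real run would direct the I/O at an HDFS client instead of the local disk, and the function name and structure here are illustrative assumptions, not the cited papers' tooling.

```python
import os
import tempfile
import time

def time_write_read(size_bytes):
    """Measure wall-clock time for one write and one read of size_bytes.

    Local-disk stand-in for the HDFS W/R measurements discussed above.
    """
    data = os.urandom(size_bytes)
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name

    t0 = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # force the write to reach the disk
    t_write = time.perf_counter() - t0

    t0 = time.perf_counter()
    with open(path, "rb") as f:
        read_back = f.read()
    t_read = time.perf_counter() - t0

    os.remove(path)
    assert read_back == data  # sanity check: data survived the round trip
    return t_write, t_read

w, r = time_write_read(1024 * 1024)  # time a 1 MB write and read
```

Repeating such measurements over a range of file sizes (and multiple trials per size, to average out the external disturbances the excerpt mentions) yields the kind of empirical W/R profile the measurement-based studies rely on.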
“…A similar research work was conducted by Dong et al. In that work, the authors presented a performance characterization and model for read and write operations on the Hadoop Distributed File System (HDFS). HDFS is a file system designed for large-scale data-analysis problems, often used through MapReduce programming models.…”
Section: Background and Related Work
confidence: 99%
“…Saini et al. 8 … In previous papers, 42,43 we have approached the performance and workload characterization problem from a different perspective, focusing on understanding the impact of different parameters. In one work, 42 we presented an analysis of the effect of multiple parameters on the I/O performance of write operations on a PFS, whereas in another work, 43 … A similar research work was conducted by Dong et al. 44 In that work, 44 the authors presented a performance characterization and model for read and write operations on the Hadoop Distributed File System (HDFS). HDFS is a file system designed for large-scale data-analysis problems, often used through MapReduce programming models.…”
Section: Performance and Workload Characterization
confidence: 99%
“…Hadoop is a representative open-source software framework for the distributed storage and processing of large amounts of data. As Hadoop stores and processes large amounts of data on the disks of distributed nodes, continuous disk input and output occurs, making real-time processing impossible [9][10][11]. Furthermore, when input and output are concentrated on a specific node, a bottleneck occurs and the overall processing speed drops.…”
Section: Introduction
confidence: 99%