2023
DOI: 10.1109/access.2023.3247190
|View full text |Cite
|
Sign up to set email alerts
|

Dynamic Replication Policy on HDFS Based on Machine Learning Clustering

Abstract: Data growth in recent years has been swift, leading to the emergence of big data science. Distributed File Systems (DFS) are commonly used to handle big data, like Google File System (GFS), Hadoop Distributed File System (HDFS), and others. The DFS should provide the availability of data and reliability of the system in case of failure. The DFS replicates the files in different locations to provide availability and reliability. These replications consume storage space and other resources. The importance of the… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1

Citation Types

0
1
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
2
2

Relationship

0
4

Authors

Journals

citations
Cited by 4 publications
(1 citation statement)
references
References 25 publications
0
1
0
Order By: Relevance
“…They implemented a heuristic model that uses restore techniques and checkpoints, supported by a statistical model that predicts the revocation time by defining the best checkpoint interval and scrutinizing the price change. Ahmed et al 139 integrates reactive fault tolerance approaches, that is, replication with adaptive such that machine learning to minimize storage usage, optimize read and write operations, and maintain the availability and reliability of the system. The paper uses machine learning to cluster groups and applied dynamic replication policy on each group to enhance their availability.…”
Section: Taxonomy Of Fault Tolerance Approachesmentioning
confidence: 99%
“…They implemented a heuristic model that uses restore techniques and checkpoints, supported by a statistical model that predicts the revocation time by defining the best checkpoint interval and scrutinizing the price change. Ahmed et al 139 integrates reactive fault tolerance approaches, that is, replication with adaptive such that machine learning to minimize storage usage, optimize read and write operations, and maintain the availability and reliability of the system. The paper uses machine learning to cluster groups and applied dynamic replication policy on each group to enhance their availability.…”
Section: Taxonomy Of Fault Tolerance Approachesmentioning
confidence: 99%