2019 9th Latin-American Symposium on Dependable Computing (LADC) 2019
DOI: 10.1109/ladc48089.2019.8995674

Improving Data Availability in HDFS through Replica Balancing

Cited by 9 publications (10 citation statements)
References 3 publications
“…
• initial RPP, in which the replicas are placed immediately after writing the files based on the standard Replica Placement Policy of the HDFS (i.e., without reactive balancing), as presented in Section 2.1;
• datanode, in which the redistribution of the blocks is done by the HDFS Balancer configured with the default balancing policy (Section 4);
• blockpool, in which the rearrangement of the replicas is performed by HDFS Balancer configured with the blockpool policy (Section 4); and
• custom, in which the replica balancing process in the file system is conducted by the HDFS Balancer customized with the "data availability" priority presented in [Fazul et al 2019], as pointed out in Section 5.…”
Section: Experiments and Discussion (mentioning)
confidence: 99%
“…In previous work [Fazul et al 2019], we proposed a customized balancing policy for the HDFS Balancer, which focuses on improving data availability and performance through replica balancing. To this end, a balancing priority, called "data availability", has been incorporated into the block move scheduling step executed in each balancing iteration to prioritize block movements that increase the availability of the data stored in the HDFS, that is, place the replicas in as many racks as possible.…”
Section: Related Work (mentioning)
confidence: 99%
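The "data availability" priority described in the statement above can be illustrated with a minimal sketch. This is not the authors' implementation or the HDFS Balancer code: the `Move` type, the `rack_gain` and `schedule` functions, and the dictionary-based cluster model are all hypothetical, and the sketch only assumes that a move is preferred when it increases the number of distinct racks holding replicas of a block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """A candidate block movement (hypothetical model, not an HDFS type)."""
    block: str
    source: str  # datanode that currently holds the replica
    target: str  # datanode that would receive the replica


def rack_gain(move, replicas, rack_of):
    """Return the change in the number of distinct racks holding the block."""
    nodes_before = replicas[move.block]
    nodes_after = (nodes_before - {move.source}) | {move.target}
    racks_before = {rack_of[n] for n in nodes_before}
    racks_after = {rack_of[n] for n in nodes_after}
    return len(racks_after) - len(racks_before)


def schedule(moves, replicas, rack_of):
    """Order candidate moves so that those spreading replicas over more racks come first."""
    return sorted(moves, key=lambda m: rack_gain(m, replicas, rack_of), reverse=True)


# Assumed toy cluster: five datanodes on three racks, one block with three replicas.
rack_of = {"dn1": "r1", "dn2": "r1", "dn3": "r2", "dn4": "r3", "dn5": "r1"}
replicas = {"b1": {"dn1", "dn2", "dn3"}}
candidates = [
    Move("b1", "dn1", "dn5"),  # stays on rack r1: no change in rack count
    Move("b1", "dn3", "dn5"),  # leaves rack r2: fewer racks
    Move("b1", "dn1", "dn4"),  # reaches rack r3: more racks
]
ordered = schedule(candidates, replicas, rack_of)
```

In this toy cluster, the move from `dn1` to `dn4` is scheduled first because it is the only one that brings a replica of `b1` onto a third rack. A real balancer would also have to respect node utilization thresholds, which this sketch ignores.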
“…IV. CUSTOMIZED REPLICA BALANCING POLICY Previous work attested that replicas imbalance directly affect the HDFS performance in serving I/O bound applications [12]. Motivated by the achieved results, we defined a customized policy for the HDFS Balancer dedicated to optimizing the default operation policy of the Hadoop's native balancer.…”
Section: A. HDFS Balancer (mentioning)
confidence: 99%
“…This work investigates the behavior of the customized policy under the simultaneous use of two priorities: "rack reliability" and "data availability". In previous work, the reliability and availability priorities were evaluated entirely independently [Fazul et al 2019a, Fazul et al 2019b]. Combining these priorities, in turn, is new to the customized policy and opens up a set of new optimizations for replica balancing, which makes it the main contribution of this paper.…”
Section: Data Distribution, Data Availability (unclassified)