2022
DOI: 10.1145/3565025

The what, The from, and The to: The Migration Games in Deduplicated Systems

Abstract: Deduplication reduces the size of the data stored in large-scale storage systems by replacing duplicate data blocks with references to their unique copies. This creates dependencies between files that contain similar content, and complicates the management of data in the system. In this paper, we address the problem of data migration, where files are remapped between different volumes as a result of system expansion or maintenance. The challenge of determining which files and blocks to migrate has been studied…

Cited by 7 publications (3 citation statements)
References 23 publications
“…These two policies are in reality the same as our emphasized space efficiency rationale. As for migration traffic [44], i.e., the amount of data that is moved across servers, we think it is implicitly consistent with our space efficiency rationale. The intrinsic reason is that replicating a block means transmitting this replica across the network, leading to more migration traffic.…”
Section: Discussion; citation type: supporting
confidence: 63%
“…It is because the generated data replicas in our work provide the opportunity for block copies to work when a block suffers from hardware failure or software crash. Load balance [7], [44], [45] is a major concentration in distributed storage systems, which often conflicts with our space efficiency rationale. To be specific, the system's space cost can be minimized by mapping all files to a single server, which enables detection and deletion of all duplicate blocks.…”
Section: Discussion; citation type: mentioning
confidence: 95%