2020
DOI: 10.1109/tpds.2020.2984632

The Design of Fast Content-Defined Chunking for Data Deduplication Based Storage Systems

Cited by 53 publications (20 citation statements). References 25 publications.

“…FastCDC [11] uses four optimization techniques, resulting in a huge performance improvement in chunking for deduplication systems. As Fig.…”
Section: A. Background
confidence: 99%
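
The four optimizations referred to here are, per the FastCDC paper: a Gear-based rolling hash in place of Rabin, an optimized (zero-padded) hash judgment, cut-point skipping below the minimum chunk size, and normalized chunking with two matching difficulties. Below is a minimal C sketch of the resulting chunking loop; the 2 KB/8 KB/64 KB size parameters and the two mask constants follow the paper's defaults, while the splitmix64-seeded table is an illustrative stand-in for the precomputed random table a real implementation would ship.

    #include <stdint.h>
    #include <stddef.h>

    static uint64_t gear_table[256];

    /* Fill the Gear table with fixed pseudo-random 64-bit values
     * (splitmix64 here); real implementations ship a precomputed table. */
    static void gear_init(void)
    {
        uint64_t x = 0;
        for (int i = 0; i < 256; i++) {
            uint64_t z = (x += 0x9e3779b97f4a7c15ULL);
            z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
            z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
            gear_table[i] = z ^ (z >> 31);
        }
    }

    /* Return the length of the next chunk starting at data[0]. */
    static size_t fastcdc_cut(const uint8_t *data, size_t len)
    {
        const size_t MIN = 2 * 1024, NORMAL = 8 * 1024, MAX = 64 * 1024;
        const uint64_t MASK_S = 0x0000d9f003530000ULL; /* 15 bits: harder */
        const uint64_t MASK_L = 0x0000d90003530000ULL; /* 11 bits: easier */

        size_t normal = len < NORMAL ? len : NORMAL;
        size_t bound  = len < MAX    ? len : MAX;
        uint64_t fp = 0;
        size_t i = len <= MIN ? len : MIN;  /* cut-point skipping: no chunk < MIN */

        for (; i < normal; i++) {           /* harder mask suppresses small chunks */
            fp = (fp << 1) + gear_table[data[i]];
            if (!(fp & MASK_S)) return i + 1;
        }
        for (; i < bound; i++) {            /* easier mask suppresses large chunks */
            fp = (fp << 1) + gear_table[data[i]];
            if (!(fp & MASK_L)) return i + 1;
        }
        return bound;                       /* forced cut at MAX size or at EOF */
    }

A driver would call gear_init() once, then call fastcdc_cut repeatedly on the unchunked remainder of the buffer, advancing by the returned chunk length each time.
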
“…To facilitate future study of CDC algorithms, we list essential properties that are indicators of a good or bad CDC algorithm, which were used by AE and FastCDC [11], [12]:…”
Section: Introduction
confidence: 99%
“…The earliest CDC algorithm is based on the Rabin fingerprint [8]. To pursue chunking speed, Xia Wen et al. proposed FastCDC [9]: in the fingerprint-matching process, the Gear fingerprint is used instead of Rabin, and two bytes are moved at a time to speed up file traversal; at the same time, two fingerprint-matching difficulty levels are used to increase the stability of chunk sizes. To reduce computation in fingerprint matching, Bjørner Nikolaj et al. proposed a new CDC algorithm based on the interval maximum value, LMC [10], which sets a cut-point when the maximum value in a fixed window falls in the middle of the window; the authors point out that the LMC algorithm can be used to locate differential data in file delta synchronization, and the feasibility of LMC is verified by theory and experiment.…”
Section: Figure 1: Communication Flow of the Rsync Algorithm
confidence: 99%
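
The interval-maximum rule quoted above admits a direct, if naive, implementation. In the sketch below, a byte becomes a cut-point when it is the strict maximum of the surrounding 2*w+1 window; the half-width w, the use of raw byte values rather than hash values, and the strict-maximum tie-break are assumptions read off the one-sentence description, and the O(n*w) scan ignores the amortization the published algorithm would use.

    #include <stdint.h>
    #include <stddef.h>

    /* Return the length of the next chunk starting at data[0]: cut just
     * after the first position whose byte is the strict maximum of the
     * surrounding 2*w+1 window, or at EOF if no such position exists. */
    static size_t lmc_cut(const uint8_t *data, size_t len, size_t w)
    {
        for (size_t i = w; i + w < len; i++) {
            int is_max = 1;
            for (size_t j = i - w; j <= i + w; j++) {
                if (j != i && data[j] >= data[i]) { is_max = 0; break; }
            }
            if (is_max) return i + 1;   /* cut just after the local maximum */
        }
        return len;                     /* no interval maximum: cut at EOF */
    }
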
“…Almost 75% of the data in the digital world is duplicated according to a survey [1], and duplicated data reaches approximately 90% in backup and file systems [2]. The development of deduplication technology in recent years has provided an effective solution for duplicated data; deduplication is a data compression strategy applied to storage systems with high data compression rates [51]–[56].…”
Section: Introduction
confidence: 99%