2012
DOI: 10.1016/j.ipl.2012.01.012
Hash challenges: Stretching the limits of compare-by-hash in distributed data deduplication

Abstract: We propose a technique for reducing communication overheads when sending data across a network. Our technique, called hash challenges, leverages existing deduplication solutions based on compare-by-hash by being able to determine redundant data chunks by exchanging substantially less meta-data. Hash challenges can be used directly on any existing compare-by-hash protocol, with no relevant additional computational complexity. Using real data from reference workloads, we show that hash challenges can save as muc…

Cited by 11 publications (6 citation statements)
References 11 publications
“…Joao Barreto [12] proposed an alternative technique to CBH called Hash Challenges (HCs). Using HCs, he tried to identify redundant chunks by using hash fragments instead of their complete hashes.…”
Section: Related Work
confidence: 99%
“…For example, the SHA-1 (20-byte) and MD5 (16-byte) hash functions are used. Joao Barreto [11] used the SHA-1 algorithm to compute hashes for the chunks. Although SHA-1 avoids hash collisions, it is more CPU-intensive than the MD5 hash algorithm.…”
Section: Related Work
confidence: 99%
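The per-chunk SHA-1 hashing described in this citation statement can be sketched as follows. This is a minimal illustration, not the paper's implementation: fixed-size chunking and the 4096-byte chunk size are assumptions for brevity (deduplication systems often use content-defined chunking instead).

```python
import hashlib

def chunk_hashes(data: bytes, chunk_size: int = 4096) -> list:
    """Split data into fixed-size chunks and hash each with SHA-1 (20-byte digest)."""
    hashes = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        hashes.append(hashlib.sha1(chunk).hexdigest())
    return hashes

# Identical chunks produce identical digests, which is what compare-by-hash
# deduplication exploits: here the first and third chunks are duplicates.
payload = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
digests = chunk_hashes(payload)
```

Because SHA-1 digests are 20 bytes each, transmitting one full digest per chunk is exactly the metadata cost that hash challenges aim to reduce.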
“…Compare-by-Hash method to reduce lookup latency and communication overhead across the network [11]. The deduplication is performed by comparing a k-byte prefix of each hash.…”
Section: Barreto Proposed Hash Challenges (HC), Which Is Similar To Compare-by-Hash
confidence: 99%
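The k-byte-prefix comparison described above can be sketched as follows. This is a hedged illustration of the general idea, not the protocol from the paper: the prefix length `K = 4`, the helper names, and the single-round check are all assumptions chosen for clarity.

```python
import hashlib

K = 4  # prefix length in bytes -- an illustrative choice, not taken from the paper

def full_hash(chunk: bytes) -> bytes:
    """Full 20-byte SHA-1 digest of a chunk."""
    return hashlib.sha1(chunk).digest()

def is_probably_known(prefix: bytes, known_hashes: set) -> bool:
    """Receiver-side check: does any stored chunk hash start with this prefix?

    A prefix match only means the chunk is a *candidate* duplicate; a
    follow-up exchange (e.g. the full hash) would confirm before the
    sender skips transmitting the chunk.
    """
    return any(h.startswith(prefix) for h in known_hashes)

# The receiver already stores one chunk.
known = {full_hash(b"chunk A")}

# The sender transmits only K-byte prefixes instead of full 20-byte digests,
# cutting per-chunk metadata from 20 bytes to K bytes.
results = {}
for chunk in (b"chunk A", b"chunk B"):
    prefix = full_hash(chunk)[:K]
    results[chunk] = is_probably_known(prefix, known)
```

In this sketch `results[b"chunk A"]` is `True` (candidate duplicate) while `results[b"chunk B"]` is, with overwhelming probability, `False` (definitely new, so the data must be sent); a short prefix trades a small false-positive probability for a large reduction in exchanged metadata.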
“…A major challenge in handling the image deduplication process [1,17–20] in cloud storage is its time and space complexity. The time required to check whether a duplicate copy exists, and the memory required to store all the hash values, are tremendously high, which often leads to additional overhead for cloud users.…”
Section: Introduction
confidence: 99%