2014
DOI: 10.1145/2512348

Read-Performance Optimization for Deduplication-Based Storage Systems in the Cloud

Abstract: Data deduplication has been demonstrated to be an effective technique in reducing the total data transferred over the network and the storage space in cloud backup, archiving, and primary storage systems, such as VM (virtual machine) platforms. However, the performance of restore operations from a deduplicated backup can be significantly lower than that without deduplication. The main reason lies in the fact that a file or block is split into multiple small data chunks that are often located in different disks…
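
The effect described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `count_read_ios`, the fingerprint strings, and the `(disk, offset)` layout are hypothetical, and the model simply assumes that chunks stored back to back on the same disk can be fetched in one sequential read. Under those assumptions it counts how many separate disk reads a restore needs.

```python
from typing import Dict, List, Tuple

# Hypothetical model: a chunk lives at (disk_id, offset), with the offset
# measured in whole chunks. None of these names come from the paper.
Location = Tuple[int, int]


def count_read_ios(recipe: List[str], chunk_location: Dict[str, Location]) -> int:
    """Count the separate disk reads needed to restore a file.

    recipe lists the chunk fingerprints in file order. A chunk that sits on
    the same disk at the offset right after the previous chunk is assumed to
    be served by the same sequential read; anything else starts a new read.
    """
    ios = 0
    prev = None
    for fingerprint in recipe:
        disk, offset = chunk_location[fingerprint]
        if prev is None or disk != prev[0] or offset != prev[1] + 1:
            ios += 1  # new seek: different disk or non-adjacent offset
        prev = (disk, offset)
    return ios


recipe = ["a", "b", "c", "d"]

# Without deduplication: the file's chunks are laid out contiguously.
contiguous = {"a": (0, 0), "b": (0, 1), "c": (0, 2), "d": (0, 3)}

# With deduplication: chunks were stored wherever they first appeared.
scattered = {"a": (0, 0), "b": (1, 7), "c": (0, 1), "d": (2, 3)}

print(count_read_ios(recipe, contiguous))  # one sequential read
print(count_read_ios(recipe, scattered))   # one read per chunk
```

In this toy layout the contiguous file is restored with a single read, while the deduplicated layout needs one read per chunk, which is the kind of slowdown the abstract attributes to scattered chunks.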

Cited by 55 publications (22 citation statements)
References 19 publications
“…Some other impends of cost curtailment may include supervising the data redundancy in the cloud cluster. As in [28,29] different algorithms have been proposed for intermittently and nonintermittently used data.…”
Section: Cloudthings
Type: mentioning
confidence: 99%
“…Thus, a subsequent read request to the data will incur many small disk I/O operations, a phenomenon we call read amplification phenomenon [25]. Figure 4 shows the processing workflow in a traditional deduplication storage system.…”
Section: The Read Amplification Problem
Type: mentioning
confidence: 99%
“…In SSD/HDD hybrid storage systems, Yongseok et al [30] proposed a dynamic scheme to divide the flash memory cache to cache space and over-provisioned space, thus providing better cache performance and GC efficiency inside SSDs. SAR [24], [25] and Nitro [21] combines the flash-based SSD and data deduplication to improve the performance of primary storage systems. The iCache is inspired by these previous studies and designed to collaborate with Select-Dedupe to further eliminate the redundant write requests and address the read amplification problem in primary storage systems.…”
Section: Data Deduplication
Type: mentioning
confidence: 99%
“…Cloud storage optimization [16][17][18][19][20][21][22] is another issue to improve the quality of service of the cloud computing systems. In [16], optimizations are proposed to reduce the volume of data to be transferred per data access for the respects of privacy and security, which did not consider the storage cost.…”
Section: Related Work
Type: mentioning
confidence: 99%