Very large block-level data backup systems need scalable data deduplication and garbage collection techniques to make efficient use of storage space while minimizing the performance overhead of doing so. Although the deduplication and garbage collection logic is conceptually straightforward, implementing it poses a significant technical challenge because only a small portion of the associated data structures can fit in memory. In this paper, we describe the design, implementation and evaluation of a data deduplication and garbage collection engine called Sungem, which is designed to remove duplicate blocks from incremental data backup streams. Sungem features three novel techniques to maximize deduplication throughput without compromising the deduplication ratio. First, Sungem puts related fingerprint sequences, rather than fingerprints from the same backup stream, into the same container in order to increase fingerprint prefetching efficiency. Second, to make the most of the memory space reserved for storing fingerprints, Sungem varies the sampling rates for fingerprint sequences based on their stability. Third, Sungem combines reference counts and expiration times in a unique way to arrive at the first known incremental garbage collection algorithm whose bookkeeping overhead is proportional to the size of a disk volume's incremental backup snapshot rather than its full backup snapshot. We evaluated the Sungem prototype using a real-world data backup trace and showed that Sungem's average throughput exceeds 200,000 fingerprint lookups per second on a standard x86 server, including the garbage collection cost.
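The abstract does not spell out the bookkeeping behind the third technique, so the following is only a minimal, hypothetical sketch (with all names and structures invented for illustration, not taken from Sungem) of how a per-fingerprint record might combine a reference count with an expiration time so that deleting a snapshot touches only the fingerprints in its incremental change set:

```python
from dataclasses import dataclass

# Hypothetical illustration: a block becomes reclaimable once no retained
# snapshot references it AND its latest retention deadline has passed.

@dataclass
class FingerprintRecord:
    refcount: int     # number of retained snapshots referencing the block
    expires_at: int   # latest retention deadline among those snapshots

class GarbageCollector:
    def __init__(self):
        self.index = {}  # fingerprint -> FingerprintRecord

    def on_backup_block(self, fingerprint, retention_deadline):
        """Invoked for every block (new or duplicate) in an incremental backup."""
        rec = self.index.get(fingerprint)
        if rec is None:
            self.index[fingerprint] = FingerprintRecord(1, retention_deadline)
        else:
            rec.refcount += 1
            rec.expires_at = max(rec.expires_at, retention_deadline)

    def on_snapshot_delete(self, changed_fingerprints, now):
        """Walks only the deleted snapshot's incremental change set."""
        reclaimed = []
        for fp in changed_fingerprints:
            rec = self.index[fp]
            rec.refcount -= 1
            if rec.refcount == 0 and rec.expires_at <= now:
                del self.index[fp]
                reclaimed.append(fp)   # the block's space can be reused
        return reclaimed
```

Under this reading, the per-deletion work is proportional to the size of the snapshot's incremental change set, which matches the overhead property claimed in the abstract.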
Traditional storage systems provide a simple read/write interface, which is inadequate for low-locality, update-intensive workloads because it limits disk scheduling flexibility and results in inefficient use of buffer memory and raw disk bandwidth. This paper describes an update-aware disk access interface that allows applications to explicitly specify disk update requests and to associate with such requests call-back functions that are invoked when the requested disk blocks are brought into memory. Because call-back functions offer a continuation mechanism after retrieval of the requested blocks, storage systems supporting this interface gain more flexibility in scheduling pending disk update requests. In particular, this interface enables a simple but effective technique called Batching mOdifications with Sequential Commit (BOSC), which greatly improves the sustained throughput of a storage system under low-locality, update-intensive workloads. In addition, together with a space-efficient, low-latency disk logging technique, BOSC is able to deliver the same durability guarantee as synchronous disk updates. Empirical measurements show that the random update throughput of a BOSC-based B+ tree is more than an order of magnitude higher than that of the same B+ tree implementation on a traditional storage system.
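As a rough illustration of such an update-aware interface (the names and structure below are hypothetical, not the paper's actual API), an update request can append its logical change to a sequential log for durability and register a call-back that is applied only when the target block is later brought into memory:

```python
import collections, os

# Hypothetical sketch in the spirit of BOSC: updates are logged sequentially
# for durability, and the in-place modifications are deferred and batched
# until the affected block is read into memory.

class UpdateAwareStore:
    def __init__(self, log_path):
        self.log = open(log_path, "ab")               # sequential redo log
        self.pending = collections.defaultdict(list)  # block_id -> call-backs
        self.cache = {}                               # block_id -> block bytes

    def update(self, block_id, log_record, apply_cb):
        # Durability first: one sequential append instead of a random write.
        self.log.write(log_record)
        self.log.flush()
        os.fsync(self.log.fileno())
        # Defer the in-place modification until the block is in memory.
        self.pending[block_id].append(apply_cb)

    def read(self, block_id, fetch_block):
        block = self.cache.get(block_id)
        if block is None:
            block = fetch_block(block_id)             # one random read
            for cb in self.pending.pop(block_id, []): # apply deferred updates
                block = cb(block)                     # in a single batch
            self.cache[block_id] = block
        return block
```

Because the continuation is attached to the request, the store is free to reorder and batch pending updates to the same block, which is where the throughput gain under low-locality workloads would come from.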
Synchronously logging updates to persistent storage first and then asynchronously committing these updates to their rightful storage locations is a well-known and heavily used technique for improving the sustained throughput of write-intensive, disk-based data processing systems, whose latency and throughput are accordingly largely determined by those of the underlying logging mechanism. The conventional wisdom is that logging operations are relatively straightforward to optimize because the associated disk access pattern is largely sequential. However, achieving both high throughput and low latency for fine-grained logging operations, whose payloads are smaller than a disk sector, turns out to be extremely challenging. This paper describes the experiences and lessons we gained from building a disk logging system that successfully delivers over 1.2 million 256-byte logging operations per second, with an average logging latency below 1 msec.
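One standard way to attack this problem, given here purely as an illustrative sketch rather than the mechanism the paper actually builds, is group commit: many sub-sector records are packed into one sector-aligned buffer and made durable with a single sequential write, so the per-write cost is amortized while each record's latency stays bounded by the flush interval:

```python
import os, threading, time

SECTOR = 4096  # assumed physical sector size

class GroupCommitLogger:
    """Illustrative group-commit logger; all parameters are hypothetical."""

    def __init__(self, path, flush_interval=0.0005):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
        self.flush_interval = flush_interval
        self.lock = threading.Lock()
        self.buf = bytearray()
        self.waiters = []
        threading.Thread(target=self._flusher, daemon=True).start()

    def append(self, record: bytes):
        """Blocks until the batch containing this record is durable on disk."""
        done = threading.Event()
        with self.lock:
            self.buf += record
            self.waiters.append(done)
        done.wait()

    def _flusher(self):
        while True:
            time.sleep(self.flush_interval)
            with self.lock:
                if not self.buf:
                    continue
                pad = (-len(self.buf)) % SECTOR        # pad to a sector boundary
                data = bytes(self.buf) + b"\0" * pad
                waiters, self.buf, self.waiters = self.waiters, bytearray(), []
            os.write(self.fd, data)
            os.fsync(self.fd)
            for w in waiters:
                w.set()
```

A plain group-commit design like this trades a small amount of latency (the flush interval) for write amortization; sustaining over a million sub-sector operations per second at sub-millisecond latency, as reported in the abstract, presumably requires considerably more careful engineering than this sketch.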