Data integrity is a fundamental aspect of storage security and reliability. With the advent of network storage and new technology trends that result in new failure modes for storage, interesting challenges arise in ensuring data integrity. In this paper, we discuss the causes of integrity violations in storage and present a survey of integrity assurance techniques that exist today. We describe several interesting applications of storage integrity checking, apart from security, and discuss the implementation issues associated with techniques. Based on our analysis, we discuss the choices and tradeoffs associated with each mechanism. We then identify and formalize a new class of integrity assurance techniques that involve logical redundancy. We describe how logical redundancy can be used in today's systems to perform efficient and seamless integrity assurance.
An organization's data is often its most valuable asset, but today's file systems provide few facilities to ensure its safety. Databases, on the other hand, have long provided transactions. Transactions are useful because they provide atomicity, consistency, isolation, and durability (ACID). Many applications could make use of these semantics, but databases have a wide variety of non-standard interfaces. For example, applications like mail servers currently perform elaborate error handling to ensure atomicity and consistency, because it is easier than using a DBMS. A transaction-oriented programming model eliminates complex error-handling code, because failed operations can simply be aborted without side effects. We have designed a file system that exports ACID transactions to user-level applications, while preserving the ubiquitous and convenient POSIX interface. In our prototype ACID file system, called Amino, updated applications can protect arbitrary sequences of system calls within a transaction. Unmodified applications operate without any changes, but each system call is transaction protected. We also built a recoverable memory library with support for nested transactions to allow applications to keep their in-memory data structures consistent with the file system. Our performance evaluation shows that ACID semantics can be added to applications with acceptable overheads. When Amino adds atomicity, consistency, and isolation functionality to an application, it performs close to Ext3. Amino achieves durability up to 46% faster than Ext3, thanks to improved locality.
Developing file systems from scratch is difficult and error prone. Layered, or stackable, file systems are a powerful technique to incrementally extend the functionality of existing file systems on commodity OSes at runtime. In this paper, we analyze the evolution of layering from historical models to what is found in four different present day commodity OSes: Solaris, FreeBSD, Linux, and Microsoft Windows. We classify layered file systems into five types based on their functionality and identify the requirements that each class imposes on the OS. We then present five major design issues that we encountered during our experience of developing over twenty layered file systems on four OSes. We discuss how we have addressed each of these issues on current OSes, and present insights into useful OS and VFS features that would provide future developers more versatile solutions for incremental file system development.
Developing kernel-level file systems is a difficult task that requires a significant time investment. For experimental file systems, it is desirable to develop a prototype before investing the time required to develop a kernel-level file system. We have built a ptrace monitoring infrastructure for file system development. Because our system runs entirely in user-space, debugging is made easier and it is possible to leverage existing tested user-level libraries. Because our monitor intercepts all OS entry points (system calls and signals) it is able to provide more functionality than other prototyping techniques, which are limited by the VFS interface (FUSE) or network protocols (user-level NFS servers). We have developed several example file systems using our framework, including a pass-through layered file system, a layered encryption file system, and a userlevel ISO9660 file system. We analyzed the complexity of our code using cyclomatic complexity and other metrics. We show savings for a pass-through file system of 53% compared to existing userlevel pass-through file systems and a factor of 4.7 reduction for an in-kernel pass-through file system. Our performance evaluation demonstrates that our infrastructure has an acceptable overhead of 18.4% for a pass-through file system.
Data recoverability in the face of partial disk errors is an important prerequisite in modern storage. We have designed and implemented a prototype disk system that automatically ensures the integrity of stored data, and transparently recovers vital data in the event of integrity violations. We show that by using pointer knowledge, effective integrity assurance can be performed inside a block-based disk with negligible performance overheads. We also show how semantics-aware replication of blocks can help improve the recoverability of data in the event of partial disk errors with small space overheads. Our evaluation results show that for normal user workloads, our disk system has a performance overhead of only 1-5% compared to traditional disks.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.