We propose PAIO, the first general-purpose framework that enables system designers to build custom-made Software-Defined Storage (SDS) data plane stages. It provides the means to implement storage optimizations adaptable to different workflows and user-defined policies, and allows straightforward integration with existing applications and I/O layers. PAIO allows stages to be integrated with modern SDS control planes to ensure holistic control and system-wide optimal performance. We demonstrate the performance and applicability of PAIO with two use cases. The first improves 99th percentile latency by 4× in industry-standard LSM-based key-value stores. The second ensures dynamic per-application bandwidth guarantees in shared storage environments.
Introduction

Data-centric systems such as databases, key-value stores (KVS), and machine learning engines share the need for efficient data storage and retrieval. This has led to the implementation of isolated I/O optimizations (e.g., scheduling, differentiation, caching) to address their storage requirements, such as resource fairness and throughput/latency SLOs [9,28,36]. This approach, however, has two main drawbacks. First, I/O optimizations are tightly integrated within the core of each solution, making it challenging to port them to other systems with similar performance goals. Second, in shared environments where multiple systems operate concurrently and compete for shared resources, individual optimizations can conflict with each other [15], leading to I/O contention and performance variation [29,35].

The Software-Defined Storage (SDS) [21,32] paradigm promises an appealing solution to these limitations. It aims to decouple I/O functionality into two planes: control and data. The control plane is a logically centralized entity with