We live in a data-centric world that is on track to generate close to 200 zettabytes of data by the year 2025 [56]. Our data processing requirements have grown as well, as we push to build frameworks that can process large volumes of data within a few milliseconds or even microseconds. In prevalent computer system designs, data is stored passively on storage devices, brought in for processing, and the results are then written back out. As the volume of data explodes, this constant data movement has led to a data movement wall that hinders further progress and optimization in the design of data processing systems. One promising alternative to this architecture is to push computation to the data (instead of the other way around) and design a computational storage device, or CSD. The idea of a CSD is not new and can trace its roots to the pioneering work done in the 1970s and 1990s. More recently, with the emergence of non-volatile memory (NVM) storage in mainstream computing (e.g., NAND flash and Optane), the idea has again gained considerable traction, with multiple academic and commercial prototypes now available. In this brief survey we present a systematic analysis of work done in the area of computational storage and present future directions.
The Big Data trend is putting strain on modern storage systems, which have to support high-performance I/O access to large quantities of data. With the prevalent von Neumann computing architecture, this data is constantly moved back and forth between the computing entity (i.e., the CPU) and the storage entities (DRAM and non-volatile memory (NVM) storage). Hence, as the data volume grows, this constant data movement between the CPU and storage devices has emerged as a key performance bottleneck. To improve the situation, researchers have advocated leveraging computational storage devices (CSDs), which offer a programmable interface for running user-defined data processing operations close to the storage without excessive data movement, thus offering performance improvements. However, despite their potential, building CSD-aware applications remains a challenging task because the right APIs and abstractions have seen little exploration and experimentation. This is due to limited access to the latest CSD/NVM devices, emerging device interfaces, and the closed-source software internals of the devices. To remedy the situation, in this work we present an open-source CSD prototype over emerging NVMe Zoned Namespaces (ZNS) SSDs, together with an interface that can be used to explore application designs for CSD/NVM storage devices. In this paper we summarize the current state of the practice with CSDs, make a case for designing a CSD prototype with the ZNS interface and eBPF (ZCSD), and present our initial findings. The prototype is available at https://github.com/Dantali0n/qemu-csd.