We present BurstZ+, an accelerator platform that eliminates the communication bottleneck between PCIe-attached scientific computing accelerators and their host servers, via hardware-optimized compression. While accelerators such as GPUs and FPGAs provide enormous computing capabilities, their effectiveness quickly deteriorates once data is larger than its on-board memory capacity, and performance becomes limited by the communication bandwidth of moving data between the host memory and accelerator. Compression has not been very useful in solving this issue due to performance and efficiency issues of compressing floating point numbers, which scientific data often consists of. BurstZ+ is an FPGA-based prototype accelerator platform which addresses the bandwidth issue via a class of novel hardware-optimized floating point compression algorithm called ZFP-V. We demonstrate that BurstZ+ can completely remove the host-side communication bottleneck for accelerators, using multiple stencil kernels with a wide range of operational intensities. Evaluated against hand-optimized implementations of kernel accelerators of the same architecture, our single-pipeline BurstZ+ prototype outperforms an accelerator without compression by almost 4×, and even an accelerator with enough memory for the entire dataset by over 2×. Furthermore, the projected performance of BurstZ+ on a future, faster FPGA scales to almost 7× that of the same accelerator without compression, whose performance is still limited by the PCIe bandwidth.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.