2022
DOI: 10.1145/3476831

BurstZ+: Eliminating The Communication Bottleneck of Scientific Computing Accelerators via Accelerated Compression

Abstract: We present BurstZ+, an accelerator platform that eliminates the communication bottleneck between PCIe-attached scientific computing accelerators and their host servers, via hardware-optimized compression. While accelerators such as GPUs and FPGAs provide enormous computing capabilities, their effectiveness quickly deteriorates once data is larger than their on-board memory capacity, and performance becomes limited by the communication bandwidth of moving data between the host memory and accelerator. Compression …

Cited by 7 publications (3 citation statements)
References 63 publications
“…Sun et al [21], to appear in 2022, tackle the same bandwidth problem as ours by using a mix of compression and data layout. Although this approach does not rely on polyhedral dependency analysis, it features the same base idea: group together data that is being used together.…”
Section: Column Major Row Major Data Tiling + Row Major
Citation type: mentioning
confidence: 99%
“…Nonetheless, tiling in the on-chip memory of accelerators is not mentioned. Sun et al [28], [29] presented compression-based approaches to address the performance bottleneck imposed by data transfer during executing stencil tasks in accelerator-based (e.g., CPU-GPU and CPU-FPGA) systems. Their work can be leveraged in combination with ours to further enhance the performance.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Sun et al [19] proposed an accelerator platform that eliminates the data movement bottleneck between PCIe-attached FPGAs and their host servers via compression. Their approach mainly focuses on optimizing the ZFP compression algorithm [20] on a hardware (i.e.…”
Section: Related Work
Citation type: mentioning
confidence: 99%