2019 IEEE/ACM International Workshop on Heterogeneous High-Performance Reconfigurable Computing (H2RC) 2019
DOI: 10.1109/h2rc49586.2019.00007

The Memory Controller Wall: Benchmarking the Intel FPGA SDK for OpenCL Memory Interface

Abstract: Supported by their high power efficiency and recent advancements in High Level Synthesis (HLS), FPGAs are quickly finding their way into HPC and cloud systems. Large amounts of work have been done so far on loop and area optimizations for different applications on FPGAs using HLS. However, a comprehensive analysis of the behavior and efficiency of the memory controller of FPGAs is missing in literature, which becomes even more crucial when the limited memory bandwidth of modern FPGAs compared to their GPU coun…

Cited by 15 publications (5 citation statements)
References 9 publications

“…First, benchmarking traditional memory on FPGAs. Previous work [13], [14], [15] tries to benchmark traditional memory, e.g., DDR3, on the FPGA by using high-level languages, e.g., OpenCL. In contrast, we benchmark HBM on the state-of-the-art FPGA.…”
Section: Related Work
mentioning
confidence: 99%
“…In HPC, the memory wall is one of the main limitations of FPGAs for applications. The memory requires a controller to reorder requests to minimize row conflicts, and as a consequence the throughput depends on memory controller implementation [15], [16]. The behaviour of memory controllers is often overlooked [17], [18] or simplified as in the performance model proposed by Wang et al [8] for Intel OpenCL SDK.…”
Section: State Of the Art
mentioning
confidence: 99%
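
The citing statement above says that a memory controller reorders requests to minimize row conflicts. As a rough illustration only, the following sketch counts row conflicts for one access stream before and after a simple windowed reordering. The single-bank open-row model, the 1024-byte row size, and the function names `count_row_conflicts` and `reorder_by_row` are assumptions made for this example; they are not taken from the paper or from any vendor's controller.

```python
# Toy model (assumption): one DRAM bank with a single open row.
# A "row conflict" is counted whenever a request targets a row
# different from the one currently open.

def row_of(address, row_bytes=1024):
    """Map a byte address to a row index (row size is hypothetical)."""
    return address // row_bytes


def count_row_conflicts(addresses, row_bytes=1024):
    """Count how often consecutive requests switch the open row."""
    conflicts = 0
    open_row = None
    for address in addresses:
        row = row_of(address, row_bytes)
        if open_row is not None and row != open_row:
            conflicts += 1
        open_row = row
    return conflicts


def reorder_by_row(addresses, window=4, row_bytes=1024):
    """Group requests by row inside fixed-size windows.

    This is a simplified stand-in for controller reordering: a real
    controller also considers banks, timing, and fairness.
    """
    reordered = []
    for start in range(0, len(addresses), window):
        chunk = addresses[start:start + window]
        # sorted() is stable, so requests to the same row keep their order.
        reordered.extend(sorted(chunk, key=lambda a: row_of(a, row_bytes)))
    return reordered


# Two interleaved streams that alternate between row 0 and row 4.
requests = [0, 4096, 64, 4160, 128, 4224, 192, 4288]

in_order = count_row_conflicts(requests)
windowed = count_row_conflicts(reorder_by_row(requests, window=4))
full = count_row_conflicts(reorder_by_row(requests, window=8))
print(in_order, windowed, full)
```

In this toy model a larger reordering window removes more row switches, which is one reason throughput can depend on how the controller is implemented.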
“…FlexCL improves models covering memory access patterns with a short CPU/GPU execution, but it continues being the main source of error of the model. As some comparisons show, the memory controller makes differences in the access pattern and hence performance [15], [19]-[21]; moreover, CPU/GPU devices have a more sophisticated memory hierarchy that can hide DRAM latency. As well as the memory controller, the memory standard or technology changes the interaction with the FPGA pipeline.…”
Section: State Of the Art
mentioning
confidence: 99%
“…II. RELATED WORKS. Several research works have investigated FPGAs performance when used as hardware accelerators, mostly using synthetic benchmarks to estimate the bandwidth of off-chip memories [26], [27], [28], and OpenCL kernels to measure the FPGA computing performance [29], [30], [31]. However, only few tools utilize the Roofline Model, and none assess also the on-chip memories bandwidth. In [26] is presented the Shuhai Verilog benchmark, used to characterize the performance of HBM and DDR off-chip memories embedded in the Xilinx Alveo U280.…”
mentioning
confidence: 99%
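
The statement above refers to the Roofline Model. For readers unfamiliar with it, here is a minimal sketch of the standard roofline bound, where attainable performance is the smaller of peak compute and memory bandwidth times arithmetic intensity. The function names and the device numbers below are hypothetical placeholders, not measurements from any of the cited works.

```python
def roofline_attainable(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Upper bound on performance (GFLOP/s) under the roofline model."""
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (FLOP/byte) where the two roofs meet."""
    return peak_gflops / bandwidth_gbs


# Hypothetical device: 1000 GFLOP/s peak compute, 34 GB/s memory bandwidth.
PEAK = 1000.0
BANDWIDTH = 34.0

memory_bound = roofline_attainable(PEAK, BANDWIDTH, 0.25)    # low intensity
compute_bound = roofline_attainable(PEAK, BANDWIDTH, 100.0)  # high intensity
print(memory_bound, compute_bound, ridge_point(PEAK, BANDWIDTH))
```

Kernels with intensity below the ridge point are limited by memory bandwidth in this model, which is why measured rather than nominal bandwidth matters when the bound is applied.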