Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2022
DOI: 10.1145/3490422.3502369

HeteroFlow

Abstract: To achieve high performance with FPGA-equipped heterogeneous compute systems, it is crucial to co-optimize data placement and compute scheduling to maximize data reuse and bandwidth utilization for both on- and off-chip memory accesses. However, optimizing the data placement for FPGA accelerators is a complex task. One must acquire in-depth knowledge of the target FPGA device and its associated memory system in order to apply a set of advanced optimizations. Even with the latest high-level synthesis (HLS) tools…

Citations: cited by 15 publications (5 citation statements)
References: 25 publications
“…Those kernels are functional but not optimized (e.g. buffer sizing, burst transfer), and could be improved as future work on heterogeneous optimization [35,40].…”
Section: Glue Code Generation (mentioning)
confidence: 99%
“…On the contrary, PREESM considers kernels as black boxes without assumptions on their inner C++ code. HeteroFlow [40] too takes Halide as an input, via the HeteroHalide [27] compiler. While HeteroFlow supports data transfer directives written by the designer, as well as buffer reuse, it does not embed a delay type system to perform buffer sizing.…”
Section: HLS Tools with Buffer Optimization (mentioning)
confidence: 99%
“…Since (10) is nonlinear in $T_{k,k-1}$, it needs to be solved using an iterative Gauss-Newton method. Given an estimation of the relative transformation $T_{k,k-1}$, an incremental update $T(\xi)$ of the estimate can be parameterized with a twist coordinate $\xi \in \mathfrak{se}(3)$.…”
Section: Sparse Image Alignment (mentioning)
confidence: 99%
“…The IRC and FA hardware accelerators as well as the host code are developed using Xilinx SDSoC 2019.1 and HLS C/C++. We also utilize the state-of-the-art HeteroFlow [10] to develop IRC and FA designs. With HeteroFlow, we generate only the HLS C/C++ code for the accelerators, as it does not support SDSoC nor the Xilinx ZU9EG FPGA.…”
Section: A Experiments Setup (mentioning)
confidence: 99%