2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2019
DOI: 10.1109/ipdpsw.2019.00072

OpenMP to FPGA Offloading Prototype Using OpenCL SDK

citations
Cited by 18 publications
(7 citation statements)
references
References 6 publications
“…ExaHyPE, an Exascale Hyperbolic PDE design [30] used a pragma-based GPU parallelization approach for object-oriented code, and documented lessons learned. Several other related works include demonstrating GPU support for OpenMP offloading features in compilers in Flang/Clang [3,25], a proof-of-concept implementation of offloading for FPGA based accelerators [14,26], and an interprocedural statical analysis heuristic at runtime to select optimal grid sizes for offloaded target team constructs [27], among others. There are publicly available benchmark suites to evaluate heterogeneous application performance, e.g.…”
Section: Related Work (mentioning)
confidence: 99%
“…Knaust et al [29] use Clang [30] to outline omp target regions at the level of the LLVM IR, and feed them into Intel's OpenCL HLS tool-chain to generate a hardware kernel for the FPGA. Their approach uses Intel's OpenCL API to allow the communication between host and FPGA.…”
Section: Resource Utilization (mentioning)
confidence: 99%
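For readers unfamiliar with the construct named in the statement above, the following is a minimal sketch of an `omp target` region in C. It is an illustration only and is not taken from the cited paper: the function name `vec_add` and its parameters are hypothetical. The loop body under the pragma is the kind of region a compiler would outline into a separate device kernel, while the `map` clauses describe the host-to-device and device-to-host transfers.

```c
#include <stddef.h>

/* Hypothetical example: element-wise vector addition.
 * The loop under the pragma is the "target region" that an
 * offloading compiler outlines into a device kernel.
 * map(to: ...) copies inputs to the device before the region runs;
 * map(from: ...) copies the result back to the host afterwards. */
void vec_add(const float *a, const float *b, float *c, size_t n)
{
    #pragma omp target parallel for map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

When built without OpenMP support, or when no offload device is available, the pragma is ignored or the region falls back to the host, so the function still computes the same result on the CPU.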
“…This typically leads to very high compile times and very low FPGA occupation and performance, since CPU- and GPU-optimized code is notably inefficient in the FPGA architectures. Further work by Knaust [13] and Huthmann [14] attack this problem in different ways. The first one opts to prototype the FPGA device with OpenCL and compiler-specific interfaces, requiring IR (Intermediate Representation) backporting to make use of the HLS system and OpenCL interfaces.…”
Section: Related Work (mentioning)
confidence: 99%
“…More flexible than [13,14] is the aforementioned Yviquel et al [10] work. It does not generate target binary code but rather a Scala implementation (as a Java runtime binary) to be ran on any Apache Spark cluster.…”
Section: Related Work (mentioning)
confidence: 99%