2020
DOI: 10.1177/1094342020972461

CFD code adaptation to the FPGA architecture

Abstract: In recent years, we have observed the intensive development of accelerated computing platforms. Although current trends indicate a well-established position of GPU devices in the HPC environment, the FPGA (Field-Programmable Gate Array) aspires to be an alternative solution for offloading CPU computation. This paper presents a systematic adaptation of four CFD (Computational Fluid Dynamics) kernels to the Xilinx Alveo U250 FPGA. The goal of this paper is to investigate the potential of the FPGA architecture a…

Cited by 7 publications (4 citation statements)
References 24 publications (36 reference statements)

“…So, the high-bandwidth data transmission would become extremely beneficial to augment the performance of these kernels. So, the 2.5D blocking was developed: for data migrating between each layer of the BRAM, there is only one layer of data downloaded into the BRAM block from the global memory [12]. So, the overall demand on the memory bandwidth is decreased.…”
Section: CFD Acceleration With FPGA
Citation type: mentioning
confidence: 99%
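
The 2.5D blocking idea in the statement above can be illustrated with a minimal sketch. It assumes a 7-point averaging stencil on a small 3D grid, which is not taken from the cited paper; the function names and the `load_plane` helper are hypothetical, and a Python list stands in for the on-chip BRAM window. Only three planes are held locally, and each step along z fetches exactly one new plane from "global memory".

```python
def stencil_7pt_reference(grid):
    """Naive 7-point average over the interior; reads neighbours straight from the full grid."""
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    out = [[row[:] for row in plane] for plane in grid]
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                out[z][y][x] = (grid[z][y][x] + grid[z - 1][y][x] + grid[z + 1][y][x]
                                + grid[z][y - 1][x] + grid[z][y + 1][x]
                                + grid[z][y][x - 1] + grid[z][y][x + 1]) / 7.0
    return out


def stencil_7pt_streamed(grid):
    """Same stencil with a sliding window of three planes (assumes at least 3 planes).

    Returns the result and the number of planes fetched from the full grid.
    """
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    out = [[row[:] for row in plane] for plane in grid]
    loads = 0

    def load_plane(z):
        # Hypothetical stand-in for a global-memory to on-chip-buffer transfer.
        nonlocal loads
        loads += 1
        return [row[:] for row in grid[z]]

    below = load_plane(0)
    centre = load_plane(1)
    for z in range(1, nz - 1):
        above = load_plane(z + 1)  # the only new plane fetched for this step
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                out[z][y][x] = (centre[y][x] + below[y][x] + above[y][x]
                                + centre[y - 1][x] + centre[y + 1][x]
                                + centre[y][x - 1] + centre[y][x + 1]) / 7.0
        below, centre = centre, above  # slide the window; nothing is re-fetched
    return out, loads


if __name__ == "__main__":
    demo = [[[float(z * 16 + y * 4 + x) for x in range(4)] for y in range(4)] for z in range(5)]
    result, planes_loaded = stencil_7pt_streamed(demo)
    print("planes fetched:", planes_loaded)
```

In this sketch every plane is fetched once, whereas re-reading three planes for each output plane would fetch each interior plane up to three times. That reduction is the bandwidth saving the statement describes.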
“…In [17], a software package that implements the Conjugate Gradient algorithm for the Xilinx U280 Alveo accelerator card is presented. In [18], the Xilinx Alveo U250 FPGA architecture was investigated as the infrastructure capable of developing complex numerical simulations. The FPGA global memory bandwidth was saturated, and the kernel version was optimized using the OpenCL standard.…”
Section: Related Work a New Applications In The Smart Grid
Citation type: mentioning
confidence: 99%
“…Hardware | Description
[15] | Conjugate Gradient for Lattice Quantum Chromodynamics | U250 | Performance comparison to the standard CPU architectures (BRAM/DSP/F-F/LUT/URAM)
[17] | Conjugate Gradient for Lattice Quantum Chromodynamics | U280 | A software package which implements the CG algorithm for FPGA-based accelerator cards which can serve as a backbone for many applications which are expected to gain a significant boost factor on FPGA accelerators.
[18] | Computational Fluid Dynamics | U250 | Performance and energy efficiency of FPGA cards for Computational Fluid Dynamics codes. Performance achieved for all the CFD kernels using a throughput expressed in GB/s and power dissipation using performance per watt metric.…”
Section: Reference HPC Application
Citation type: mentioning
confidence: 99%
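
The two metrics named in the statement above, throughput in GB/s and performance per watt, reduce to simple ratios. The sketch below is illustrative only: the function names are hypothetical, 1 GB is assumed to mean 10^9 bytes, and the numbers in the usage lines are made up rather than taken from the paper.

```python
def throughput_gb_per_s(bytes_moved, seconds):
    """Data throughput in GB/s, assuming 1 GB = 1e9 bytes."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_moved / seconds / 1e9


def performance_per_watt(throughput_gbs, watts):
    """Throughput normalised by power draw, in (GB/s) per watt."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return throughput_gbs / watts


if __name__ == "__main__":
    # Made-up example values: 2e9 bytes moved in 0.5 s at a 50 W power draw.
    gbs = throughput_gb_per_s(2e9, 0.5)
    print(gbs, performance_per_watt(gbs, 50.0))
```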
“…Particularly, the last statement relates to stencil-based codes, a class of memory-bound applications that are quite common among scientific applications [18]. The stencil computations have traditionally been optimized by many authors over the years, especially considering various hardware platforms such as multicore CPUs [19], short-vector SIMD architectures [20], Intel Xeon Phi [21], GPU [22] and FPGA [23] accelerators. One of the main directions of improving the efficiency of stencil computations focuses on different methods for the domain decomposition [24], including overlapping neighbor domains [25].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
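
Domain decomposition with overlapping neighbor domains, as mentioned in the statement above, can be sketched in one dimension. The example assumes a 3-point averaging stencil and an overlap (halo) of one cell; neither the stencil nor the function names come from the cited works. Each subdomain reads a private copy of its own cells plus the overlap, so subdomains can be processed independently.

```python
def split_with_halo(n, parts, halo):
    """Split range(n) into `parts` blocks.

    Each entry is (start, stop, lo, hi): the owned range [start, stop)
    and the wider read range [lo, hi) that overlaps the neighbours.
    """
    base, extra = divmod(n, parts)
    blocks = []
    start = 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        blocks.append((start, stop, max(0, start - halo), min(n, stop + halo)))
        start = stop
    return blocks


def smooth_global(u):
    """3-point average over the interior of the whole array."""
    out = u[:]
    for i in range(1, len(u) - 1):
        out[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return out


def smooth_decomposed(u, parts):
    """Same stencil, computed block by block from overlapping local copies."""
    n = len(u)
    out = u[:]
    for start, stop, lo, hi in split_with_halo(n, parts, halo=1):
        local = u[lo:hi]  # private copy: owned cells plus one halo cell per side
        for i in range(max(start, 1), min(stop, n - 1)):
            j = i - lo  # index of cell i inside the local copy
            out[i] = (local[j - 1] + local[j] + local[j + 1]) / 3.0
    return out


if __name__ == "__main__":
    data = [float(i * i) for i in range(10)]
    print(smooth_decomposed(data, 3) == smooth_global(data))
```

The overlap costs a few redundant reads per block, but it removes the need for a block to access its neighbor's data during the update.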