CFD code adaptation to the FPGA architecture

Rojek, Krzysztof; Halbiniak, Kamil; Kuczynski, Lukasz

doi:10.1177/1094342020972461

Cited by 7 publications

(4 citation statements)

References 24 publications

(36 reference statements)


“…So, the high-bandwidth data transmission would become extremely beneficial to augment the performance of these kernel. So, the 2.5D blocking was developed: for data migrating between each layer of the BRAM, there is only one layer of data downloaded into the BRAM block from the global memory [12]. So, the overall demand on the memory bandwidth is decreased.…”

Section: CFD Acceleration With FPGA (mentioning)

confidence: 99%

Specialized fluid estimation implementation on the FPGA

Yang

2023

ACE


This paper studies the possibility of using Field Programmable Gate Arrays (FPGAs) to accelerate Computational Fluid Dynamics (CFD). CFD is an industrial analysis tool for estimating the flow of matter. Until now, CFD has mostly been implemented on conventional CPUs, possibly accelerated with GPUs in a high-performance computing center. This paper studies the architecture of the FPGA, compares the FPGA to the CPU, and reviews the applications and algorithms of CFD. We studied previous works from different research groups and found that CFD on FPGA holds a large advantage in the efficiency of the overall system. FPGAs use fewer hardware resources and usually offer shorter computing time and higher data throughput. So, the FPGA is a viable and cost-effective alternative for future CFD computation, for the same cost as constructing a high-performance computing center.
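The 2.5D blocking idea quoted in the citation statement above (only one new layer fetched from global memory for each step between layers) can be illustrated with a minimal Python sketch. It is not taken from either paper: the 7-point averaging stencil, the function names, and the use of plain Python lists as a stand-in for on-chip BRAM are all assumptions made for illustration, and the x-y tiling that usually accompanies 2.5D blocking is omitted.

```python
def make_grid(nz, ny, nx):
    """Hypothetical test data: a small 3D grid of floats indexed [z][y][x]."""
    return [[[float((z * 31 + y * 17 + x * 7) % 11) for x in range(nx)]
             for y in range(ny)] for z in range(nz)]


def copy_grid(grid):
    return [[row[:] for row in plane] for plane in grid]


def stencil_naive(grid):
    """Reference 7-point average; every neighbour is read from 'global memory'."""
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    out = copy_grid(grid)  # boundary points are left unchanged
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                out[z][y][x] = (grid[z][y][x]
                                + grid[z - 1][y][x] + grid[z + 1][y][x]
                                + grid[z][y - 1][x] + grid[z][y + 1][x]
                                + grid[z][y][x - 1] + grid[z][y][x + 1]) / 7.0
    return out


def stencil_plane_streaming(grid):
    """Same stencil, but with a sliding window of three z-planes.

    The three local planes play the role of on-chip buffers (an assumption
    standing in for BRAM). Moving one layer up in z loads exactly one new
    plane; the other two are reused. Assumes at least two planes in z.
    Returns the result and the number of plane loads.
    """
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    out = copy_grid(grid)
    loads = 0

    def load_plane(z):
        # Models a single transfer of one layer from global memory.
        nonlocal loads
        loads += 1
        return [row[:] for row in grid[z]]

    below = load_plane(0)
    centre = load_plane(1)
    for z in range(1, nz - 1):
        above = load_plane(z + 1)  # the only global-memory read in this step
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                out[z][y][x] = (centre[y][x]
                                + below[y][x] + above[y][x]
                                + centre[y - 1][x] + centre[y + 1][x]
                                + centre[y][x - 1] + centre[y][x + 1]) / 7.0
        below, centre = centre, above  # slide the window by one layer
    return out, loads
```

In this sketch each z-plane is loaded once, whereas the naive version touches each interior plane three times (as the layer below, the centre layer, and the layer above), which is the reduction in bandwidth demand the statement refers to.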

“…In [17], a software package that implements the Conjugate Gradient algorithm for the Xilinx U280 Alveo accelerator card is presented. In [18], the Xilinx Alveo U250 FPGA architecture was investigated as the infrastructure capable of developing complex numerical simulations. The FPGA global memory bandwidth was saturated, and the kernel version was optimized using the OpenCL standard.…”

Section: Related Work a New Applications In The Smart Grid (mentioning)

confidence: 99%

“…Hardware Description [15] Conjugate Gradient for Lattice Quantum Chromodynamics U250 Performance comparison to the standard CPU architectures (BRAM/DSP/F-F/LUT/URAM) [17] Conjugate Gradient for Lattice Quantum Chromodynamics U280 A software package which implements the CG algorithm for FPGA-based accelerator cards which can serve as a backbone for many applications which are expected to gain a significant boost factor on FPGA accelerators. [18] Computational Fluids Dynamic U250 Performance and energy efficiency of FPGA cards for Computational Fluids Dynamic codes. Performance achieved for all the CFD kernels using a throughput expressed in GB/s and power dissipation using performance per watt metric.…”

Section: Reference HPC Application (mentioning)

confidence: 99%

High-Performance Computing Architecture for Sample Value Processing in the Smart Grid

et al. 2022


“…Particularly, the last statement relates to stencil-based codes, a class of memory-bound applications that are quite common among scientific applications [18]. The stencil computations have traditionally been optimized by many authors over the years, especially considering various hardware platforms such as multicore CPUs [19], short-vector SIMD architectures [20], Intel Xeon Phi [21], GPU [22] and FPGA [23] accelerators. One of the main directions of improving the efficiency of stencil computations is focused around different methods for the domain decomposition [24], including overlapping neighbor domains [25].…”

Section: Related Work (mentioning)

confidence: 99%

Architectural Adaptation and Performance-Energy Optimization for CFD Application on AMD EPYC Rome

Szustak, Wyrzykowski, Kuczynski, et al. 2021

IEEE Trans. Parallel Distrib. Syst.

Self Cite


The advantages of the second-generation AMD EPYC Rome processors can be successfully used in the race to Exascale. However, the novel architecture's complexity makes it challenging to adapt demanding scientific codes, like stencil ones, to platforms with Rome CPUs. This paper tackles this challenge by exploring the adaptation of the stencil-based CFD (computational fluid dynamics) application called MPDATA to these processors' influential features. We show that the previously proposed parametric adaptation methodology can be profitably applied to extend the performance portability of the memory-bound MPDATA on the AMD EPYC architecture. The extension of the parametric adaptation on the novel architecture requires careful consideration of two relevant aspects that reflect splitting the Rome architecture into multiple dies: features of the cache hierarchy and partitioning cores into work teams. The paper also investigates the correlation between the performance optimizations and energy efficiency for a ccNUMA platform powered by top-of-the-line 64-core AMD Rome 7742 CPUs, comparing the results against two servers with Intel Xeon Scalable processors of different generations. Even without appealing to prices, the achieved performance and energy efficiency results are a solid argument confirming the competitiveness of AMD Rome processors against Intel Xeon CPUs in scientific applications.
