2020
DOI: 10.1002/cpe.6036

Multi‐GPU performance optimization of a computational fluid dynamics code using OpenACC

Abstract: This article investigates the multi‐GPU performance of a 3D buoyancy driven cavity solver using MPI and OpenACC directives on multiple platforms. The article shows that decomposing the total problem in different dimensions affects the strong scaling performance significantly for the GPU. Without proper performance optimizations, it is shown that 1D domain decomposition scales poorly on multiple GPUs due to the noncontiguous memory access. The performance using whatever decompositions can be benefited f…

Citations: cited by 9 publications (5 citation statements)
References: 23 publications
“…At present, there are researches on using OpenACC for application optimization on GPUs. [24][25][26] To verify the performance, portability and productivity of the proposed scheme on SW26010 heterogeneous many-core processor on GPU, we focus on the OpenACC acceleration of MD simulation of silicon crystal on GPUs. The main work of this article includes:…”
Section: Related Surveys and Our Contributions (mentioning)
confidence: 99%
“…OpenACC 23 is a directive‐based programming model for acceleration devices, which has been widely supported by the industry. At present, there are researches on using OpenACC for application optimization on GPUs 24‐26 . To verify the performance, portability and productivity of the proposed scheme on SW26010 heterogeneous many‐core processor on GPU, we focus on the OpenACC acceleration of MD simulation of silicon crystal on GPUs.…”
Section: Introduction (mentioning)
confidence: 99%
“…V1: Pack/Unpack. The goal of this optimization is to improve the memory throughput and reduce the communication cost if the required data are not located sequentially in memory [23]. As Fortran is a column-majored language, the first index i of a matrix A(i, j, k) denotes the fastest change.…”
Section: GPU Optimization Using OpenACC (mentioning)
confidence: 99%
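The pack/unpack idea in the statement above can be illustrated with a minimal sketch. It is written in plain Python rather than Fortran with OpenACC, and it is not the cited authors' implementation. The function names (`col_major_offset`, `pack_i_plane`, `unpack_i_plane`) and the array sizes are hypothetical, chosen only for illustration. The sketch shows that in column-major storage a plane at fixed first index `i` is strided in memory, so it is gathered into a contiguous buffer before being exchanged and scattered back afterwards.

```python
# Illustrative sketch only: hypothetical helper names, toy array sizes.

def col_major_offset(i, j, k, nx, ny):
    """Flat 0-based offset of A(i, j, k) in column-major (Fortran) order."""
    return i + nx * (j + ny * k)


def pack_i_plane(flat, i, nx, ny, nz):
    """Gather the strided plane A(i, :, :) into a contiguous buffer."""
    return [flat[col_major_offset(i, j, k, nx, ny)]
            for k in range(nz) for j in range(ny)]


def unpack_i_plane(flat, buf, i, nx, ny, nz):
    """Scatter a contiguous buffer back into the plane A(i, :, :)."""
    values = iter(buf)
    for k in range(nz):
        for j in range(ny):
            flat[col_major_offset(i, j, k, nx, ny)] = next(values)


nx, ny, nz = 4, 3, 2
flat = list(range(nx * ny * nz))  # each element stores its own offset

# Offsets touched by the plane i = nx - 1: consecutive entries are nx apart.
offsets = [col_major_offset(nx - 1, j, k, nx, ny)
           for k in range(nz) for j in range(ny)]

packed = pack_i_plane(flat, nx - 1, nx, ny, nz)      # contiguous send buffer
target = [0] * len(flat)
unpack_i_plane(target, packed, nx - 1, nx, ny, nz)   # receiver-side scatter
```

Here `offsets` steps through memory with a stride of `nx`, whereas a plane at fixed last index `k` would occupy one unbroken run of `nx * ny` elements and would need no packing.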
“…Using ssspnt, different problems, platforms and different scalings can be compared more easily. Similar to Ref [23], every time in this paper is measured consecutively for at least three instances. The difference for each time point is smaller than 1% (usually less than 1 s out of more than 120 s).…”
Section: Performance Metrics (mentioning)
confidence: 99%