2018
DOI: 10.1109/tpds.2018.2810237

Sparse Geometries Handling in Lattice Boltzmann Method Implementation for Graphic Processors

Abstract: We describe a high-performance implementation of the lattice Boltzmann method (LBM) for sparse geometries on graphic processors. In our implementation we cover the whole geometry with a uniform mesh of small tiles and carry out calculations for each tile independently with proper data synchronization at the tile edges. For this method, we provide both a theoretical analysis of complexity and the results for real implementations involving two-dimensional (2D) and three-dimensional (3D) geometries. Based on the …
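
The tile covering mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 8x8 test geometry, and the tile size of 2 are all hypothetical, chosen only to show how a sparse geometry maps onto a uniform mesh of tiles in which only non-empty tiles need storage and computation.

```python
def nonempty_tiles(mask, tile):
    """Return sorted (tile_row, tile_col) pairs for tiles holding at least one fluid node."""
    tiles = set()
    for y, row in enumerate(mask):
        for x, is_fluid in enumerate(row):
            if is_fluid:
                # Integer division maps a node coordinate to its tile coordinate.
                tiles.add((y // tile, x // tile))
    return sorted(tiles)


def tile_utilisation(mask, tile):
    """Fraction of nodes inside the allocated (non-empty) tiles that are fluid.

    Assumes the geometry dimensions are multiples of the tile size, so every
    allocated tile holds exactly tile * tile nodes.
    """
    fluid = sum(1 for row in mask for v in row if v)
    allocated = len(nonempty_tiles(mask, tile)) * tile * tile
    return fluid / allocated if allocated else 0.0


# Hypothetical 8x8 geometry: a diagonal channel three nodes wide (True = fluid).
mask = [[abs(x - y) <= 1 for x in range(8)] for y in range(8)]
tiles = nonempty_tiles(mask, 2)
utilisation = tile_utilisation(mask, 2)
```

In this toy case only the tiles touching the channel are allocated, and the utilisation value shows how many of the stored nodes are actually fluid. Smaller tiles generally raise utilisation but increase the number of tile edges that need synchronization.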

Cited by 14 publications (14 citation statements)
References 28 publications
“…Moreover, the workload bounds are set to be in the magnitude of a 2D/3D computational fluid dynamic application involving 10^7 cells per process. The lower bound is 52 FLOP per cells, whereas the upper bound is 1165 FLOP per cells [14]. The number of overloading PEs is at most 20% of the total number of PEs.…”
Section: B Finding the Optimal Load Balancing Intervals (mentioning)
confidence: 98%
“…Lattice-Boltzmann computational fluid dynamic problem with 10^9 D2Q9 cells per processing unit with a performance of 1 Gflops [26]. The number of processing units is equal to the number of cores available in the supercomputer "Sunway…”
Section: Synthetic Benchmarks (mentioning)
confidence: 99%
“…Interactions between nodes are entirely linear, while the method's non-linearity enters a local collision process within each node [12]. This property makes the LBM very amenable to high-performance computing on parallel architectures, including GPUs [13,14]. The key advantage of LBM from the viewpoint of microfluidics is the adequacy for simulation of mesoscopic physics and multi-scale effects, which are hard to describe macroscopically.…”
Section: Introduction (mentioning)
confidence: 99%