2011
DOI: 10.1016/j.jocs.2011.01.008
A simulation suite for Lattice-Boltzmann based real-time CFD applications exploiting multi-level parallelism on modern multi- and many-core architectures

Abstract: We present a software approach to hardware-oriented numerics which builds upon an augmented, previously published set of open-source libraries facilitating portable code development and optimisation on a wide range of modern computer architectures. In order to maximise efficiency, we exploit all levels of parallelism, including vectorisation within CPU cores, the Cell BE and GPUs, shared memory thread-level parallelism between cores, and parallelism between heterogeneous distributed memory resources in cluster…
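The paper's solver itself is not reproduced here, but the Lattice-Boltzmann update it parallelises can be illustrated with a minimal sketch. Below is a plain, single-core NumPy implementation of the standard D2Q9 BGK collide-and-stream step with periodic boundaries; the names `equilibrium` and `lbm_step` are illustrative and not taken from the libraries described in the abstract.

```python
import numpy as np

# D2Q9 lattice velocities and weights (standard values, not from the paper).
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9] * 4 + [1/36] * 4)

def equilibrium(rho, ux, uy):
    """Second-order equilibrium distribution for the BGK collision operator."""
    cu = 3.0 * (c[:, 0, None, None] * ux + c[:, 1, None, None] * uy)
    usq = 1.5 * (ux ** 2 + uy ** 2)
    return w[:, None, None] * rho * (1.0 + cu + 0.5 * cu ** 2 - usq)

def lbm_step(f, omega):
    """One collide-and-stream update; f has shape (9, ny, nx).

    All arithmetic is expressed as whole-array NumPy operations, i.e. the
    innermost (vectorisable) level of parallelism the abstract refers to.
    """
    rho = f.sum(axis=0)                                   # density
    ux = (f * c[:, 0, None, None]).sum(axis=0) / rho      # x-velocity
    uy = (f * c[:, 1, None, None]).sum(axis=0) / rho      # y-velocity
    f += omega * (equilibrium(rho, ux, uy) - f)           # BGK collision
    for i in range(9):                                    # periodic streaming
        f[i] = np.roll(np.roll(f[i], c[i, 0], axis=1), c[i, 1], axis=0)
    return f
```

In the setting the abstract describes, this same update would additionally be threaded across CPU cores, offloaded to accelerators such as GPUs or the Cell BE, and distributed across cluster nodes.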

Cited by 18 publications (8 citation statements)
References 22 publications
“…As a final benchmark, we demonstrate the effectiveness of the full ICARUS Tegra K1 cluster with a sophisticated CFD solver based on the Lattice-Boltzmann method, optimised for GPU as well as CPU execution [8]. In Figure 4 we depict how energy and time to solution behave in a strong scaling test in single precision (note that, this time, a smaller value on the x-axis means higher performance).…”
Section: Hardware and Energy Efficiency Scalability
confidence: 99%
“…The authors did not extend the work to the 3D domain, where the topology of indices, the data transfer between devices, and the boundary conditions are more complex and may drive the parallel implementation to unsatisfactory scalability. Other parallel LBM-related works aim at specific research areas such as complex boundary problems, turbulent flow, and transport dissipative particle dynamics, but few multi‐GPU LBM implementations have been reported in shale formation simulation or digital rock physics. Hence, we decided to explore the potential of high‐performance computing applications in the oil and gas field, especially in digital rock simulation, by implementing a multiscale 3D LBM program on a large‐scale GPU cluster.…”
Section: Introduction
confidence: 99%
“…Early implementations benefited from graphical languages such as OpenGL (Li et al., 2003; Zhu et al., 2008). With the advent of modern programmable graphics cards and the new graphical programming language CUDA by NVIDIA, LBM-based computational fluid dynamics solvers have been extensively ported to graphics processors in order to facilitate the previously expensive, time-consuming flow simulations (Tölke and Krafczyk, 2008; Geveler et al., 2011). Fortunately, the new multi-component entropic model (Arcidiacono et al., 2006a; 2006b; 2008) still inherits its explicit nature, and massive parallelization is viable through the new heterogeneous, general-purpose GPU architectures.…”
Section: Introduction
confidence: 99%