A graphics processing unit-accelerated meshless method for two-dimensional compressible flows

Zhang, Jia-Le; Chen, Hongquan; Cao, Cheng

doi:10.1080/19942060.2017.1317027

Cited by 4 publications

(9 citation statements)

References 31 publications


“…In order to optimize the GPU performance, the number of threads per block for each kernel should be carefully tuned. According to our recently reported work [33], 64 threads per block is a reasonable choice for the CUDA kernels. Thus the total number of thread blocks could be determined by $\mathit{gridDim} = (\mathit{nTotalThread} + \mathit{blockDim} - 1) / \mathit{blockDim}$ (17), where nTotalThread represents the total number of threads.…”

Section: CUDA Kernel Functions (mentioning)

confidence: 99%
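The block-count formula quoted above reads like the usual ceiling division for sizing a CUDA grid. A minimal sketch of that arithmetic follows, assuming integer division and the 64 threads per block mentioned in the statement; the function name `blocks_needed` is hypothetical and not taken from the cited paper.

```python
def blocks_needed(n_total_thread: int, block_dim: int = 64) -> int:
    """Smallest number of blocks whose threads cover n_total_thread.

    Assumes Eq. (17) is the standard ceiling division
    (n_total_thread + block_dim - 1) / block_dim with integer division.
    """
    if n_total_thread < 0 or block_dim <= 0:
        raise ValueError("thread count must be >= 0 and block size > 0")
    return (n_total_thread + block_dim - 1) // block_dim


# Hypothetical example: 1000 points, one thread per point.
grid_dim = blocks_needed(1000)
# The last block is only partly used, so a kernel would still need
# a bounds check such as `if thread_id < n_total_thread`.
spare_threads = grid_dim * 64 - 1000
```

With these numbers the grid holds slightly more threads than points, which is why kernels launched this way typically guard against out-of-range thread indices.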

“…This pattern is adopted in the present work so that all the threads in a half-warp map/access the global memory simultaneously with respect to the center of a meshless cloud. In reality, this means consecutive threads access consecutive memory addresses [33,34]. The computed results including Mach number contours and pressure coefficients are depicted in Fig.…”

Section: Device Memory Management (mentioning)

confidence: 99%
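The access pattern described in the statement above can be illustrated with plain index arithmetic. The sketch below is an assumption-laden illustration, not the cited paper's data layout: it compares a slot-major layout (all points' k-th cloud entry stored contiguously) with a point-major layout and checks which one gives the 16 threads of a half-warp consecutive addresses. The helper names and array sizes are hypothetical.

```python
HALF_WARP = 16  # half of a 32-thread CUDA warp


def slot_major_index(point: int, slot: int, n_points: int) -> int:
    """Entry `slot` of every point is stored contiguously (assumed layout)."""
    return slot * n_points + point


def point_major_index(point: int, slot: int, n_slots: int) -> int:
    """All entries of one point are stored contiguously (alternative layout)."""
    return point * n_slots + slot


def is_consecutive(addresses):
    """True if each address is exactly one past the previous one."""
    return all(b - a == 1 for a, b in zip(addresses, addresses[1:]))


# Hypothetical sizes: 1024 points, 8 cloud entries per point.
n_points, n_slots = 1024, 8
threads = range(HALF_WARP)  # thread t handles point t

# Every thread reads entry 3 of its own point's cloud.
slot_major = [slot_major_index(t, 3, n_points) for t in threads]
point_major = [point_major_index(t, 3, n_slots) for t in threads]
```

Under these assumptions only the slot-major layout yields consecutive addresses across the half-warp; the point-major layout produces a stride of `n_slots` between neighboring threads.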

“…Careful management is needed to ease the pressure on this scarce resource. Reusing non-conflicting local variables and tuning the number of threads in a block help to reduce register pressure and achieve optimal performance [33].…”

mentioning

confidence: 99%


A GPU-accelerated implicit meshless method for compressible flows

Zhang, Chen, et al., 2018, Journal of Computational Physics

Self Cite


This paper develops a recently proposed GPU-based two-dimensional explicit meshless method (Ma et al., 2014) by devising and implementing an efficient parallel LU-SGS implicit algorithm to further improve the computational efficiency. The capability of the original 2D meshless code is extended to deal with 3D complex compressible flow problems. To resolve the inherent data dependency of the standard LU-SGS method, which causes thread-racing conditions destabilizing numerical computation, a generic rainbow coloring method is presented and applied to organize the computational points into different groups by painting neighboring points with different colors. The original LU-SGS method is modified and parallelized accordingly to perform calculations in a color-by-color manner. The CUDA Fortran programming model is employed to develop the key kernel functions to apply boundary conditions, calculate time steps, evaluate residuals, and advance and update the solution in the temporal space. A series of two- and three-dimensional test cases, including compressible flows over single- and multi-element airfoils and an M6 wing, are carried out to verify the developed code. The obtained solutions agree well with experimental data and other computational results reported in the literature. Detailed analysis of the performance of the developed code reveals that the developed CPU-based implicit meshless method is at least four to eight times faster than its explicit counterpart. The computational efficiency of the implicit method could be further improved by ten to fifteen times on the GPU.
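The coloring idea in this abstract (neighboring points get different colors, then work proceeds color by color) can be sketched with a simple greedy pass. This is a generic greedy graph coloring written for illustration, not the paper's rainbow coloring algorithm; it assumes a symmetric neighbor relation (if p lists q, then q lists p), and the names `color_points` and `group_by_color` are hypothetical.

```python
from collections import defaultdict


def color_points(neighbors):
    """Greedy coloring: each point takes the smallest color that none of
    its already-colored neighbors uses. `neighbors` maps a point id to an
    iterable of neighboring point ids and is assumed to be symmetric."""
    color = {}
    for p in sorted(neighbors):
        used = {color[q] for q in neighbors[p] if q in color}
        c = 0
        while c in used:
            c += 1
        color[p] = c
    return color


def group_by_color(color):
    """Collect points of the same color; points within one group share no
    neighbor relation, so they could be updated concurrently."""
    groups = defaultdict(list)
    for p, c in color.items():
        groups[c].append(p)
    return [groups[c] for c in sorted(groups)]


# Hypothetical example: four points connected in a ring 0-1-2-3-0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
colors = color_points(ring)
groups = group_by_color(colors)
```

In a color-by-color sweep, each group would be processed as one parallel step, with the groups themselves visited sequentially so that no two neighboring points are updated at the same time.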




“…In general, recent developments in the meshless community are lively, ranging from analyses of computer execution on different platforms [6,12] and reducing computational cost by introducing a piecewise approximation [13] to the implementation of more complex multi-phase flow [14], and many more. This paper extends the spectrum of published papers with a generalized formulation of a local strong form meshless method, termed the Meshless Local Strong Form Method (MLSM), enriched with h-refinement [15] and the ability to discretize arbitrary domains [7].…”

Section: Introduction (mentioning)

confidence: 99%


Refined Meshless Local Strong Form solution of Cauchy–Navier equation on an irregular domain

Slak, Kosec, 2019, Engineering Analysis with Boundary Elements


This paper considers a numerical solution of a linear elasticity problem, namely the Cauchy-Navier equation, using a strong form method based on a local Weighted Least Squares (WLS) approximation. The main advantage of the employed numerical approach, also referred to as a Meshless Local Strong Form method, is its generality in terms of approximation setup and positions of computational nodes. In this paper, flexibility regarding the nodal position is demonstrated through two numerical examples, i.e. a drilled cantilever beam, where an irregular domain is treated with a relatively simple nodal positioning algorithm, and a Hertzian contact problem, where again, a relatively simple h-refinement algorithm is used to extensively refine discretization under the contact area. The results are presented in terms of accuracy and convergence rates, using different approximations and refinement setups, namely Gaussian and monomial based approximations, and a comparison of execution time for each block of the solution procedure.


Parallel Accelerated Fifth-Order WENO Scheme-Based Pipeline Transient Flow Solution Model

2022, Applied Sciences


The water hammer phenomenon is the main problem in long-distance pipeline networks. The MOC (method of characteristics) and finite difference methods impose severe constraints on the mesh and Courant number, while the second-order Godunov finite volume scheme has limited discontinuity-capturing capability. These methods produce severe numerical dissipation, affecting the computational efficiency at low Courant numbers. Based on the Lax-Friedrichs flux splitting method, combined with upstream and downstream virtual grid boundary conditions, this paper uses the high-precision fifth-order WENO scheme to reconstruct the interface flux and establishes a finite volume numerical model for solving the transient flow in the pipeline. The model adopts GPU parallel acceleration to improve the program’s computational efficiency. The results show that the model captures discontinuities well, without spurious oscillations, even at a low Courant number. At the same time, the model allows a high degree of flexibility in meshing because it is largely insensitive to the Courant number. The number of grid cells in the model can be significantly reduced and higher computational efficiency can be obtained compared with MOC and the second-order Godunov scheme. Furthermore, this paper analyzes the acceleration effect on different grids: the speedup from the GPU increases significantly as the number of computational grid cells grows. This model can support efficient and accurate simulation and prediction of unsteady transient processes in long-distance water pipeline systems.

