Purpose
The purpose of this paper is to improve the computational speed of solving nonlinear dynamics by using parallel methods and mixed-precision algorithm on graphic processing units (GPUs). The computational efficiency of traditional central processing units (CPUs)-based computer aided engineering software has been difficult to satisfy the needs of scientific research and practical engineering, especially for nonlinear dynamic problems. Besides, when calculations are performed on GPUs, double-precision operations are slower than single-precision operations. So this paper implemented mixed precision for nonlinear dynamic problem simulation using Belytschko-Tsay (BT) shell element on GPU.
Design/methodology/approach
To minimize data transfer between heterogeneous architectures, the parallel computation of the fully explicit finite element (FE) calculation is realized using a vectorized thread-level parallelism algorithm. An asynchronous data transmission strategy and a novel dependency relationship link-based method, for efficiently solving parallel explicit shell element equations, are used to improve the GPU utilization ratio. Finally, this paper implements mixed precision for nonlinear dynamic problems simulation using the BT shell element on a GPU and compare it to the CPU-based serially executed program and a GPU-based double-precision parallel computing program.
Findings
For a car body model containing approximately 5.3 million degrees of freedom, the computational speed is improved 25 times over CPU sequential computation, and approximately 10% over double-precision parallel computing method. The accuracy error of the mixed-precision computation is small and can satisfy the requirements of practical engineering problems.
Originality/value
This paper realized a novel FE parallel computing procedure for nonlinear dynamic problems using mixed-precision algorithm on CPU-GPU platform. Compared with the CPU serial program, the program implemented in this article obtains a 25 times acceleration ratio when calculating the model of 883,168 elements, which greatly improves the calculation speed for solving nonlinear dynamic problems.