Single-precision (SP) arithmetic can run much faster than
double-precision (DP) arithmetic on graphics processing
units (GPUs). However, using SP throughout an electronic
structure calculation does not deliver the required accuracy.
We propose a three-fold dynamic precision approach that accelerates
calculations while retaining the accuracy of DP. In this approach, SP, DP,
and mixed precision are switched dynamically during an iterative
diagonalization process.
We applied this approach to the locally optimal block preconditioned
conjugate gradient method to accelerate a large-scale eigenvalue solver
for the Kohn–Sham equation. We determined a suitable switching threshold
for each precision scheme by examining the convergence pattern
of the eigenvalue solver with only the kinetic energy operator of
the Kohn–Sham Hamiltonian. As a result, we achieved up to 8.53×
and 6.60× speedups for band structure and self-consistent field
calculations, respectively, for test systems under various boundary
conditions on NVIDIA GPUs.
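
The idea of switching precision during an iterative diagonalization can be illustrated with a small sketch. It is not the solver described above: a shifted power iteration stands in for the locally optimal block preconditioned conjugate gradient method, the operator is a one-dimensional finite-difference kinetic energy operator, SP is emulated on the CPU by rounding values to 32 bits, and "mixed precision" is assumed here to mean an SP operator application with DP reductions. The thresholds `sp_threshold` and `mp_threshold`, and all function names, are hypothetical placeholders rather than values or interfaces from this work.

```python
import math
import struct


def to_sp(x):
    """Round a double to the nearest single-precision value (emulates SP storage)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def choose_precision(residual_norm, sp_threshold=1e-3, mp_threshold=1e-5):
    """Select a scheme from the residual norm; both thresholds are hypothetical."""
    if residual_norm > sp_threshold:
        return "SP"
    if residual_norm > mp_threshold:
        return "MP"
    return "DP"


def apply_kinetic(v, single):
    """Apply T = -1/2 d^2/dx^2 with a 3-point stencil, unit spacing, zero boundaries."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        value = 0.5 * (2.0 * v[i] - left - right)
        out.append(to_sp(value) if single else value)
    return out


def lowest_eigenpair(n=8, tol=1e-10, max_iter=5000):
    """Shifted power iteration on (2I - T) with residual-driven precision switching."""
    v = [1.0 / math.sqrt(n)] * n
    history = []
    eigenvalue = 0.0
    for _ in range(max_iter):
        # Convergence check in DP: Rayleigh quotient and residual norm.
        tv = apply_kinetic(v, single=False)
        eigenvalue = sum(a * b for a, b in zip(v, tv))
        residual = math.sqrt(sum((b - eigenvalue * a) ** 2 for a, b in zip(v, tv)))
        if residual < tol:
            break
        scheme = choose_precision(residual)
        history.append(scheme)
        # Operator application: SP result for "SP" and "MP", DP result for "DP".
        tv_step = apply_kinetic(v, single=(scheme != "DP"))
        w = [2.0 * a - b for a, b in zip(v, tv_step)]
        norm = math.sqrt(sum(x * x for x in w))
        if scheme == "SP":
            # Store the norm and the new vector in SP.
            norm = to_sp(norm)
            v = [to_sp(x / norm) for x in w]
        else:
            # "MP" and "DP": the reduction and normalization stay in DP.
            v = [x / norm for x in w]
    return eigenvalue, v, history


if __name__ == "__main__":
    lam, _, schemes = lowest_eigenpair()
    print(lam, {s: schemes.count(s) for s in ("SP", "MP", "DP")})
```

In this sketch the early, far-from-converged iterations run in the cheapest scheme, and the solver moves to DP only once the residual is small enough that SP rounding error would limit further progress. The convergence check is done in DP at every step for simplicity; a practical implementation would likely avoid that extra operator application.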