Approximate Computing is a paradigm that researchers have adopted as an alternative way to keep increasing computational throughput in HPC as the growth of hardware performance slows. Precision reduction and mixed precision are the most studied of the existing techniques. In addition, some NVIDIA GPUs include Tensor Cores, which speed up certain classes of algorithms, such as matrix multiplication. This study applies Approximate Computing techniques, such as mixed precision, to matrix multiplication and stencil algorithms using OpenACC directives and the cuTensor library, and analyzes the performance gains versus the accuracy losses. Results show a speedup of 16.60× for the optimized matrix multiplication algorithm provided by the matmul intrinsic function, using 16-bit floating-point data with Tensor Cores, compared to a naive version using 64-bit floating-point data. For this same case, the accuracy loss went from approximately 10⁻²⁶ up to 10⁻¹. For the stencil algorithm, a gain of 1.60× was obtained solely by reducing the precision of the variables from 64-bit to 16-bit floating-point, with an accuracy loss from 0 to 10⁻⁹ over 300 iterations.
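The trade-off described above can be illustrated with a minimal sketch in plain Python. This is not the study's OpenACC or cuTensor implementation and it does not measure speed. It only shows how rounding matrix inputs to 16-bit floating-point, while accumulating in 64-bit, introduces a measurable error relative to a full 64-bit product. The helper names (`to_fp16`, `matmul`, `max_abs_error`), the matrix size, and the random inputs are all assumptions chosen for illustration.

```python
import random
import struct


def to_fp16(x: float) -> float:
    """Round a 64-bit float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def matmul(a, b, round_inputs=False):
    """Naive triple-loop matrix product.

    With round_inputs=True the operands are first rounded to 16 bits,
    while the sums are still accumulated in 64 bits (mixed precision).
    """
    if round_inputs:
        a = [[to_fp16(v) for v in row] for row in a]
        b = [[to_fp16(v) for v in row] for row in b]
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][p] * b[p][j] for p in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def max_abs_error(x, y):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(u - v) for rx, ry in zip(x, y) for u, v in zip(rx, ry))


# Hypothetical test data: 16x16 matrices with entries in [-1, 1].
random.seed(0)
n = 16
a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
b = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

reference = matmul(a, b)                 # 64-bit inputs
mixed = matmul(a, b, round_inputs=True)  # 16-bit inputs, 64-bit accumulation
err = max_abs_error(reference, mixed)
print(f"max abs error with 16-bit inputs: {err:.3e}")
```

The error grows with the magnitude of the entries and with the inner dimension, which is consistent with the wide range of accuracy loss reported for matrix multiplication.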