Since the introduction of the Supersingular Isogeny Diffie-Hellman (SIDH) key exchange protocol by Jao and de Feo in 2011, it and its variant, Supersingular Isogeny Key Encapsulation (SIKE), have gained significant attention as promising candidates for post-quantum cryptography (PQC). Although several implementations of the state-of-the-art SIKE mechanism have been presented for CPUs and embedded MCUs, none has targeted parallel graphics processing units (GPUs). With the advent of the IoT era, a large number of IT devices will communicate with application servers, so developing efficient instances of SIKE on the server side is also important. GPUs have been considered a promising candidate for cryptographic acceleration. In this paper, we present an efficient implementation of the SIKE mechanism on GPUs. Even though SIKE has the attractive advantage of much smaller key and ciphertext sizes than other NIST PQC candidates, its computational overhead is extremely high. A large amount of research has been conducted on enhancing the performance of SIKE, both through software optimization on typical CPUs and embedded MCUs and through hardware optimization on ASICs and FPGAs. However, generic software optimization utilizing GPUs has not been considered yet. We target a GPU implementation of the SIKEp503 parameter set, which provides security level 2 (at least as difficult to break as SHA256). For efficiency, we optimize the underlying field arithmetic, especially field multiplication and reduction over $p503 = 2^{250} \cdot 3^{159} - 1$, and take full advantage of the properties of the GPU architecture, including its memory hierarchy. The proposed GPU software running on an RTX2080Ti provides around 36376.61 KeyGens/s, 25603.72 Encaps/s, and 22211.61 Decaps/s. These are about 140.64, 157.66, and 146.81 times faster, respectively, than the SIKE CPU software on an Intel i9-10900K CPU.
As far as we know, this is the first efficient software implementation of SIKE on GPUs.
INDEX TERMS Graphics processing units (GPU), supersingular isogeny, SIKE, efficient implementation, post-quantum cryptography, parallel processing, cloud computing.
The associate editor coordinating the review of this manuscript and approving it for publication was Wei Huang.
I. INTRODUCTION
It has been widely believed that the security of currently used Public-Key Cryptosystems (PKC) such as RSA and ECC will be broken once quantum computers capable of executing Shor's algorithm are developed. To secure information and communication systems in the quantum computing era, there is huge demand for Post Quantum Cryptosystems (PQC) that are resistant to Shor's algorithm. In 2016, NIST started the competition process for selecting secure and efficient PQC algorithms. In 2020, NIST announced 7 finalists and 8 alternatives in round 3. Among them, the SIKE (Supersingular Isogeny Key Encapsulation) mechanism is the only PQC algorithm based on elliptic curves and has the advantage of much smaller key and ciphertext sizes com...