In this paper, we present a GPU-based parallel algorithm for the Learning With Errors (LWE) problem using a lattice-based Bounded Distance Decoding (BDD) approach. To the best of our knowledge, this is the first GPU-based implementation for the LWE problem. Compared to the sequential BDD implementation of Lindner-Peikert and the pruned-enumeration strategy of Kirshanova [1], our GPU-based implementation is faster by factors of almost 6 and 9, respectively. The GPU used is an NVIDIA GeForce GTX 1060 6GB. We also provide a parallel implementation using two GPUs. The results show that our algorithm is scalable and faster than the sequential versions (Lindner-Peikert and pruned enumeration) by factors of almost 13 and 16, respectively. Moreover, the results show that our parallel implementation on two GPUs is more efficient than Kirshanova et al.'s parallel implementation on 20 CPU cores.

INDEX TERMS Learning with errors, lattice-based cryptography, LLL algorithm, shortest vector problem, closest vector problem, bounded distance decoding, GPU, cryptanalysis.

NOMENCLATURE
⟨·, ·⟩  Inner product
|S|  Number of elements of a set S
‖·‖  Euclidean norm
Z_p  Set of nonnegative integers less than p
Õ(f)  Soft-O notation
v  Row vector (usually in Z^m)
a mod q  Remainder after division of a by q
poly(n)  Polynomial function of degree n
span{b_1, ..., b_n}  Span of a set of vectors over Z
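To make the BDD formulation of LWE above concrete, the following is a minimal, illustrative sketch and not the paper's implementation: it generates a toy LWE instance and shows that recovering the secret amounts to decoding the sample vector b to an unusually close point of a q-ary lattice. All parameter values (n, m, q, sigma) and variable names are assumptions chosen for illustration.

```python
# Minimal sketch (not the paper's code): a toy LWE instance and its BDD view.
# Parameters below are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
n, m, q, sigma = 8, 16, 97, 1.0               # toy dimensions, modulus, error width

A = rng.integers(0, q, size=(m, n))           # public matrix
s = rng.integers(0, q, size=n)                # secret vector
e = np.rint(rng.normal(0, sigma, size=m)).astype(int)  # small Gaussian error
b = (A @ s + e) % q                           # LWE samples

# BDD view: b lies unusually close to the q-ary lattice
# L = {A x mod q : x in Z^n} + q Z^m; decoding b to the nearest
# lattice point A s (mod q) reveals the secret s.
nearest = (A @ s) % q
offset = (b - nearest + q // 2) % q - q // 2  # centered residues recover e
print("distance from b to the lattice point:", np.linalg.norm(offset))
```

The printed distance equals the norm of the small error vector e, which is what makes the instance a bounded-distance decoding problem rather than a general closest vector problem.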
Hard lattice problems are believed to be among the most promising foundations for cryptosystems that remain secure against quantum computers. The shortest vector problem (SVP) is one of the most famous lattice problems. In this paper, we present three improvements to GPU-based parallel algorithms for solving SVP using classical enumeration and pruned enumeration. Two improvements concern preprocessing: we combine randomization with the Gaussian heuristic to obtain a basis that is expected to lead more quickly to a shortest vector, and we estimate the enumeration level at which data exchange between the CPU and GPU is optimized. The third improvement moves the generation of some enumeration points from the CPU to the GPU. We use NVIDIA GeForce GTX 1060 6GB GPUs. We achieve a significant improvement over Hermans's implementation: our improvements speed up pruned enumeration by a factor of almost 2.5 on a single GPU. Additionally, we provide a multi-GPU implementation using two GPUs. The results show that our enumeration algorithm is scalable, since with two GPUs it is faster than Hermans's implementation by a factor of almost 5. The improvements also yield a high speedup for classical enumeration: on a challenge of dimension 60, our implementation with two GPUs is faster by a factor of almost 2 than Correia's parallel implementation on a dual-socket machine with 16 physical cores and simultaneous multithreading.
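As a companion to the preprocessing step described above, the following is a small sketch of the Gaussian heuristic, which estimates the length of a shortest lattice vector and is commonly used to bound the initial enumeration radius. The function name, dimensions, and random basis are illustrative assumptions, not the paper's code or parameters.

```python
# Illustrative sketch (assumed, not the paper's implementation): the
# Gaussian-heuristic estimate of the shortest vector length for a lattice,
# often used as an initial radius for (pruned) enumeration.
import numpy as np
from math import gamma, pi

def gaussian_heuristic(basis: np.ndarray) -> float:
    """Expected shortest-vector length for the lattice spanned by the rows of `basis`."""
    n = basis.shape[0]
    # Lattice volume via the Gram matrix (also works for non-square bases).
    vol = np.sqrt(np.linalg.det(basis @ basis.T))
    return (gamma(n / 2 + 1) ** (1 / n) / np.sqrt(pi)) * vol ** (1 / n)

# Toy usage on a random full-rank integer basis (dimension chosen arbitrarily).
rng = np.random.default_rng(1)
B = rng.integers(-5, 6, size=(20, 20)).astype(float)
print("Gaussian heuristic radius:", gaussian_heuristic(B))
```

In enumeration-based SVP solvers, a tighter radius of this kind shrinks the search tree, which is one reason combining it with basis randomization can speed up the search.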