2020
DOI: 10.46586/tches.v2020.i2.49-72

Highly Efficient Architecture of NewHope-NIST on FPGA using Low-Complexity NTT/INTT

Abstract: NewHope-NIST is a promising ring learning with errors (RLWE)-based postquantum cryptography (PQC) for key encapsulation mechanisms. The performance on the field-programmable gate array (FPGA) affects the applicability of NewHope-NIST. In RLWE-based PQC algorithms, the number theoretic transform (NTT) is one of the most time-consuming operations. In this paper, low-complexity NTT and inverse NTT (INTT) are used to implement highly efficient NewHope-NIST on FPGA. First, both the pre-processing of NTT and the pos…
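For readers unfamiliar with the pre-processing mentioned in the abstract, the following is a minimal sketch of the textbook NTT-based multiplication in Z_q[x]/(x^n + 1): inputs are scaled by powers of a primitive 2n-th root of unity before the forward transform, and the result is unscaled after the inverse transform. It illustrates the baseline only and is not the paper's low-complexity architecture. The modulus q = 12289 is the NewHope modulus; the function names (`find_psi`, `ntt`, `negacyclic_mul`, `schoolbook`) and the small demo size n = 8 are assumptions made for illustration.

```python
Q = 12289  # NewHope modulus; q - 1 = 2^12 * 3, so 2n-th roots exist for n <= 2048


def find_psi(n, q=Q):
    """Search for a primitive 2n-th root of unity mod q (n a power of two)."""
    assert (q - 1) % (2 * n) == 0
    for x in range(2, q):
        psi = pow(x, (q - 1) // (2 * n), q)
        if pow(psi, n, q) == q - 1:  # psi^n = -1, so the order is exactly 2n
            return psi
    raise ValueError("no primitive 2n-th root of unity found")


def ntt(a, omega, q=Q):
    """Iterative radix-2 cyclic NTT; omega is a primitive n-th root of unity."""
    n = len(a)
    a = list(a)
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(omega, n // length, q)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % q
                a[start + k] = (u + v) % q          # butterfly: sum
                a[start + k + half] = (u - v) % q   # butterfly: difference
                w = w * w_len % q
        length *= 2
    return a


def negacyclic_mul(a, b, q=Q):
    """Multiply a and b modulo (x^n + 1, q) using pre- and post-processing."""
    n = len(a)
    psi = find_psi(n, q)
    omega = psi * psi % q
    psi_inv = pow(psi, q - 2, q)
    omega_inv = pow(omega, q - 2, q)
    n_inv = pow(n, q - 2, q)
    # Pre-processing: multiply coefficient i by psi^i.
    a_s = [x * pow(psi, i, q) % q for i, x in enumerate(a)]
    b_s = [x * pow(psi, i, q) % q for i, x in enumerate(b)]
    c_hat = [x * y % q for x, y in zip(ntt(a_s, omega, q), ntt(b_s, omega, q))]
    c_s = ntt(c_hat, omega_inv, q)
    # Post-processing: multiply coefficient i by n^-1 * psi^-i.
    return [x * n_inv % q * pow(psi_inv, i, q) % q for i, x in enumerate(c_s)]


def schoolbook(a, b, q=Q):
    """Quadratic-time reference: wrap-around terms pick up a minus sign."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c
```

The two scaling passes each cost n extra modular multiplications per transform, which is the overhead that low-complexity NTT/INTT designs aim to reduce.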

Citations: cited by 99 publications (49 citation statements)
References: 15 publications
“…(MHz). For n = 1024, the proposed NTT architecture operates approximately 9.6×, 12×, 8×, and 2× faster than that of [9], [10], [12], and [15] respectively. Xing and Li [9] proposed a ping-pong NTT architecture that used four BUs and required a large number of CCs (i.e., 1280).…”
Section: Implementation Results and Discussion
Citation type: mentioning
confidence: 99%
“…Xing and Li [9] proposed a ping-pong NTT architecture that used four BUs and required a large number of CCs (i.e., 1280). Zhang et al [10] used only two parallel BUs for the iterative computation, which utilized hardware resources effectively but consumed many CCs (i.e., 2569). Although Mert et al [15] significantly reduced the number of CCs (i.e., 200) by paralleling 32 processing elements, their NTT architecture operated at lower clock frequency and required more hardware resources.…”
Section: Implementation Results and Discussion
Citation type: mentioning
confidence: 99%
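The trade-off described in the statement above (fewer clock cycles versus a lower clock frequency) can be made concrete with a small sketch. Latency is cycles divided by frequency, so a design with fewer cycles only loses if its clock is slower by more than the ratio of the cycle counts. The cycle counts below are the ones quoted in the statement; the helper names are hypothetical, no clock frequencies are assumed because the excerpt does not give them, and the comparison assumes the counts refer to the same transform size.

```python
# Clock-cycle counts quoted in the citation statement above.
CYCLES = {
    "Xing and Li [9]": 1280,
    "Zhang et al. [10]": 2569,
    "Mert et al. [15]": 200,
}


def latency_us(cycles, freq_mhz):
    """Latency in microseconds for a given cycle count and clock in MHz."""
    return cycles / freq_mhz


def breakeven_clock_ratio(cycles_low, cycles_high):
    """Factor by which the low-cycle design's clock may be slower
    before its latency exceeds that of the high-cycle design."""
    return cycles_high / cycles_low


if __name__ == "__main__":
    low = CYCLES["Mert et al. [15]"]
    for name, cycles in CYCLES.items():
        if cycles != low:
            ratio = breakeven_clock_ratio(low, cycles)
            print(f"Mert et al. [15] vs {name}: break-even clock ratio {ratio:.3f}")
```

Passing a measured frequency to `latency_us` turns the cycle counts into comparable execution times.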
“…For RLWE, our results are given for q = 7681. Our CPA-secure RLWE-1024 implementation has an execution time very close to [34] (63µs vs. 62µs), but is less optimized in term of hardware resource consumption (4 vs. 2 DSPs). This shows some potential for HLS implementation.…”
Section: Comparison With Other Work
Citation type: mentioning
confidence: 99%