2020
DOI: 10.1109/access.2020.2988379

Area-Time Efficient Hardware Implementation of Modular Multiplication for Elliptic Curve Cryptography

Abstract: In this paper, an area-time efficient hardware implementation of modular multiplication over five National Institute of Standards and Technology (NIST)-recommended prime fields is proposed for lightweight elliptic curve cryptography (ECC). A modified radix-2 interleaved algorithm is proposed to reduce the time complexity of conventional interleaved modular multiplication. The proposed multiplication algorithm is designed in hardware and separately implemented on Xilinx Virtex-7, Virtex-6, Virtex-5, and Virtex-4…

Citations: cited by 48 publications (57 citation statements)
References: 27 publications
“…Comparing with [23], our design requires 2.0× AT1 but offers 5.6× TR/A1. As compared with [30], although the AT2 is increased by 1.3×, our design improves the TR/A2 by 3.7×. Comparing with [27], our design improves the TR/A2 by 9.9× and 7.8×, respectively, with a similar AT2 performance.…”
Section: Results (mentioning)
confidence: 76%
“…Table III compares the performance of this design and various 256-bit interleaved modular multiplication implementations based on FPGA. To speed up the modular multiplication, both [23] and [30] use the radix-2 interleaved modular multiplication algorithm which consumes 257 clock cycles. However, thanks to the proposed ultra-high radix method, this design reduces the number of iterations from 256 to 11, leading to the computation latency reduced by 66.7% and 61.4% compared with [23] and [30] respectively.…”
Section: Results (mentioning)
confidence: 99%
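
For readers unfamiliar with the technique named in the statement above, the following is a minimal software sketch of textbook radix-2 interleaved modular multiplication. It is not the hardware design of the cited paper or of the citing paper; the function name `interleaved_modmul` and the constant `P256` are illustrative choices made here. The loop runs once per operand bit, so 256-bit operands take 256 iterations; any additional setup or output cycle in a hardware implementation is not modelled.

```python
def interleaved_modmul(a: int, b: int, p: int, n: int) -> int:
    """Return (a * b) mod p using radix-2 interleaved multiplication.

    Assumes 0 <= a, b < p and that a fits in n bits.
    """
    r = 0
    for i in reversed(range(n)):   # one iteration per bit of a, MSB first
        r <<= 1                    # double the running partial result
        if (a >> i) & 1:
            r += b                 # add b when the current bit of a is set
        # Here r < 3p, so at most two subtractions restore r < p.
        if r >= p:
            r -= p
        if r >= p:
            r -= p
    return r


# NIST P-256 prime, used only as a sample 256-bit modulus.
P256 = 2**256 - 2**224 + 2**192 + 2**96 - 1

x, y = P256 - 1, P256 - 2
assert interleaved_modmul(x, y, P256, 256) == (x * y) % P256
```

Interleaving the reduction with the shift-and-add steps keeps every intermediate value below 3p, which is why the datapath only needs to be a few bits wider than the modulus.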
“…The speed and occupied area of the processor entirely depend on it. Although a radix-2 multiplier consumes less hardware resources compared to higher radix (e.g., radix-4 and radix-8) multipliers [33], it is not compatible for high-speed multiplication because of its high latency. To reduce the latency, an efficient radix-4 interleaved modular multiplication algorithm is proposed as demonstrated in Algorithm 1.…”
Section: Proposed Hardware Architectures (mentioning)
confidence: 99%
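
The latency trade-off described in the statement above can be illustrated with a generic digit-serial sketch in which each iteration consumes k bits of the multiplier, so k = 2 corresponds to radix-4. This is an assumption-laden illustration, not "Algorithm 1" of the citing paper: the name `digit_serial_modmul` is hypothetical, and the `%` operator stands in for the comparisons against multiples of p that a hardware datapath would use.

```python
def digit_serial_modmul(a: int, b: int, p: int, n: int, k: int = 2):
    """Return ((a * b) mod p, iteration count) for a radix-2**k interleaved loop.

    Assumes 0 <= a, b < p and that a fits in n bits.
    """
    iterations = -(-n // k)            # ceil(n / k) digits of k bits each
    mask = (1 << k) - 1
    r = 0
    for j in reversed(range(iterations)):
        digit = (a >> (j * k)) & mask  # next k-bit digit of a, MSB first
        r = (r << k) + digit * b       # shift by k bits and add digit * b
        r %= p                         # placeholder for the hardware reduction step
    return r, iterations


p = 2**256 - 2**224 + 2**192 + 2**96 - 1   # NIST P-256 prime as a sample modulus
a, b = p - 3, p - 5

radix2 = digit_serial_modmul(a, b, p, 256, k=1)
radix4 = digit_serial_modmul(a, b, p, 256, k=2)
assert radix2[0] == radix4[0] == (a * b) % p
assert (radix2[1], radix4[1]) == (256, 128)
```

Doubling the digit width halves the iteration count, but each iteration must then handle a larger multiple of b and a wider reduction range, which is the source of the extra hardware cost the statement mentions.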
“…Analysis of Table 9 shows that U4-based FSMs are the ones with the highest maximum operating frequency compared to other methods. The overall design quality can be estimated by the product of used resources [63] (for example, chip area occupied by a circuit) and the latency time. As it is in [63], we use the number of LUTs to compare areas required for FSM circuits based on different models (auto, one-hot, JEDI, U2 and U4).…”
Section: Benchmark (mentioning)
confidence: 99%
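
The quality metric described in the statement above, resources multiplied by latency, can be written out as a short sketch. The helper names and the two sample designs below are hypothetical and use made-up numbers purely to show the arithmetic; they are not results from any of the cited papers.

```python
def latency_ns(cycles: int, f_mhz: float) -> float:
    """Latency in nanoseconds for a given cycle count and clock frequency in MHz."""
    return cycles * 1000.0 / f_mhz


def area_time_product(luts: int, latency: float) -> float:
    """Area-time product, with LUT count as the area proxy (lower is better)."""
    return luts * latency


# Hypothetical designs: (LUTs, clock cycles, clock frequency in MHz).
designs = {
    "design_a": (1000, 257, 100.0),
    "design_b": (2500, 129, 100.0),
}

for name, (luts, cycles, f_mhz) in designs.items():
    at = area_time_product(luts, latency_ns(cycles, f_mhz))
    print(f"{name}: AT = {at:.0f} LUT*ns")
```

A design that halves the latency only improves this metric if its area grows by less than a factor of two, which is why area and speed are reported together.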