2009
DOI: 10.1007/978-3-642-02384-2_22

Efficient Acceleration of Asymmetric Cryptography on Graphics Hardware

Abstract: Graphics processing units (GPU) are increasingly being used for general purpose computing. We present implementations of large integer modular exponentiation, the core of public-key cryptosystems such as RSA, on a DirectX 10 compliant GPU. DirectX 10 compliant graphics processors are the latest generation of GPU architecture, which provide increased programming flexibility and support for integer operations. We present high performance modular exponentiation implementations based on integers represen…

Citations: cited by 56 publications (29 citation statements)
References: 13 publications
“…Exploiting much larger parallelism using the single instruction multiple threads paradigm, is realized by using a residue number system [14,29] as described in [4]. This approach is implemented for the massively parallel graphics processing units in [19]. An approach based on Montgomery multiplication which allows one to split the operand into two parts, which can be processed in parallel, is called bipartite modular multiplication and is introduced in [24].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…The research community has studied ways to reduce the latency of Montgomery multiplication by parallelizing this computation. These approaches vary from using the SIMD paradigm [8,10,18,23] to the single instruction, multiple threads paradigm using a residue number system [14,29] as described in [4,19] (see Sect. 2.3 for a more detailed overview).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…However, the latter instruction is not exposed by CUDA API. To overcome this limitation, the authors of [10] propose to use slow 32-bit multiplication, while the tests from [11] show that 12-bit arithmetic is faster because modular reduction can be done in floating-point without overflow concerns.…”
Section: -Bit Modular Arithmetic On the GPU
Citation type: mentioning
confidence: 99%
“…As of now, the research is carried out to port the remaining algorithm stages (polynomial interpolation and the CRA) to the GPU. Modular computations still constitute a big challenge on the GPU, see [10,11]. Our algorithm uses the fast modular arithmetic developed in [1] which is based on mixing floating-point and integer computations, and is supported by the modified CUDA [12] compiler.…”
Citation type: mentioning
confidence: 99%
“…Cryptologic applications of GPUs have been considered before: symmetric cryptography in [33,20,56,21,44,11,18], asymmetric cryptography in [39,54,22] for RSA and in [54,1,9] for ECC, and enhancing symmetric [8] and asymmetric [7,5,6,10] cryptanalysis.…”
Section: Introduction
Citation type: mentioning
confidence: 99%