2009
DOI: 10.1007/978-3-642-03644-6_11

Efficient Multiplication of Polynomials on Graphics Hardware

Abstract: We present an algorithm to multiply univariate polynomials with integer coefficients efficiently using the number-theoretic transform (NTT) on graphics processing units (GPUs). The same approach can be used to multiply large integers encoded as polynomials. Our algorithm exploits the fused multiply-add capabilities of the graphics hardware. NTT multiplications are executed in parallel for a set of distinct primes, followed by reconstruction using the Chinese remainder theorem (CRT) on the GPU. Our benchma…
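
The pipeline the abstract describes (one NTT-based product per prime, then CRT reconstruction) can be illustrated with a minimal sequential sketch. This is not the paper's GPU implementation: the three primes, the primitive root 3, and all function names below are assumptions chosen for illustration, and the per-prime products that the paper runs in parallel are simply computed in a loop here.

```python
# Assumed NTT-friendly primes of the form c * 2^k + 1; 3 is a primitive
# root of each. These are illustrative, not the moduli used in the paper.
PRIMES = (998244353, 469762049, 167772161)
ROOT = 3


def ntt(values, p, invert=False):
    """Iterative radix-2 NTT modulo p; len(values) must be a power of two."""
    a = list(values)
    n = len(a)
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        # Primitive length-th root of unity modulo p (inverted if requested).
        w_len = pow(ROOT, (p - 1) // length, p)
        if invert:
            w_len = pow(w_len, p - 2, p)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % p
                a[k] = (u + v) % p
                a[k + half] = (u - v) % p
                w = w * w_len % p
        length <<= 1
    if invert:
        n_inv = pow(n, p - 2, p)
        a = [x * n_inv % p for x in a]
    return a


def multiply_mod(f, g, p):
    """Product of polynomials f and g with coefficients reduced modulo p."""
    size = len(f) + len(g) - 1
    n = 1
    while n < size:
        n <<= 1
    fa = ntt([x % p for x in f] + [0] * (n - len(f)), p)
    ga = ntt([x % p for x in g] + [0] * (n - len(g)), p)
    prod = ntt([x * y % p for x, y in zip(fa, ga)], p, invert=True)
    return prod[:size]


def crt(residues, moduli):
    """Incremental CRT; returns the value in the symmetric range."""
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        # Choose t so that x + m * t is congruent to r modulo p.
        t = (r - x) * pow(m, p - 2, p) % p
        x += m * t
        m *= p
    # Map to (-m/2, m/2] so negative coefficients are recovered.
    return x - m if x > m // 2 else x


def multiply(f, g):
    """Integer polynomial product via per-prime NTTs and CRT."""
    per_prime = [multiply_mod(f, g, p) for p in PRIMES]
    return [crt(rs, PRIMES) for rs in zip(*per_prime)]
```

For example, `multiply([1, 2, 3], [4, 5])` corresponds to (1 + 2x + 3x²)(4 + 5x). The result is exact only while every true coefficient is smaller in absolute value than half the product of the chosen primes, which is why more primes are needed as coefficients grow.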

Citations: Cited by 26 publications (21 citation statements)
References: 22 publications
“…A more detailed characterization of FFT architectures can be found in [SSHA08]. The implementation of the NTT on graphic cards is described in [Eme09]. A cryptographic processor for performing elliptic curve cryptography (ECC) in the frequency domain is presented in [BKPS07].…”
Section: Previous Work Unrelated To Lattice-based Cryptography (mentioning)
confidence: 99%
“…We use the arithmetic based on mixing floating-point and integer computations [1] which is supported by the patched CUDA compiler. In what follows, we will refer to umul24 and umul24hi as intrinsics for mul24.lo and mul24.hi respectively.…”
Section: -Bit Modular Arithmetic On the GPU (mentioning)
confidence: 99%
“…In total, line 14 is compiled in 4 multiply-add (MAD) instructions. The remaining part is an inlined reduce mod operation (see [1]) with a minor change. Namely, in line 15 we use the mantissa trick [18] to multiply by 1/m and round the result down using a single MAD instruction.…”
Section: -Bit Modular Arithmetic On the GPU (mentioning)
confidence: 99%
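
The idea behind the quoted reduction step, estimating the quotient by multiplying with a precomputed 1/m and rounding down, can be sketched on the CPU as follows. This is a simplified stand-in and not the code from either paper: it uses double precision and an explicit correction step instead of the single-MAD mantissa trick the citing authors describe, and the function name `reduce_mod` and the 2^48 bound are assumptions for illustration.

```python
import math


def reduce_mod(x, m):
    """Reduce 0 <= x < 2**48 modulo m using a floating-point reciprocal.

    Hypothetical helper: the quotient is estimated as floor(x * (1/m)).
    Rounding in the reciprocal can make the estimate off by one, so the
    remainder is corrected by at most one addition or subtraction of m.
    """
    inv_m = 1.0 / m                  # precomputed once per modulus in practice
    q = int(math.floor(x * inv_m))   # quotient estimate
    r = x - q * m                    # exact integer remainder candidate
    if r < 0:
        r += m
    elif r >= m:
        r -= m
    return r
```

The 2^48 bound matches the size of a product of two 24-bit operands and keeps `x` exactly representable as a double, so the quotient estimate is never off by more than one.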