Speeding scalar multiplication over binary elliptic curves using the new carry-less multiplication instruction

Taverne, Jonathan; Faz-Hernández, Armando; Aranha, Diego F.; Rodríguez-Henríquez, Francisco; Hankerson, Darrel; López, Julio

doi:10.1007/s13389-011-0017-8

Cited by 38 publications

(42 citation statements)

References 33 publications


“…All numbers are given in 10^3 cycles on a single core. Aranha et al. [2] no Core i7-860 4-TNAF -386 -1656 Taverne et al. [19] no Core i7 (SNB) 5-τ NAF,τ &add 068 -264 -Aranha et al. [3] no Core i7 (SNB) 5-τ NAF,τ This work yields very good results for random curves and wins against all current state-of-the-art implementations over the NIST fields, even with the costly side channel countermeasures. Since all previous works have been implemented on the Sandy/Ivy Bridge, we compare with the Ivy Bridge implementation here.…”

Section: Comparison To Other Work (mentioning)

confidence: 82%

“…Since all previous works have been implemented on the Sandy/Ivy Bridge, we compare with the Ivy Bridge implementation here. For the random curve over the GF(2^233) NIST field, our implementation is about factor 1.22 (1.23 in GF(2^409)) faster than [19] and even factor 3 (GF(2^283)) and 3.5 (GF(2^571)) faster than reported in [2] for a single core. Although our Ivy Bridge implementation for Koblitz curves beats the numbers presented in [2] by 1.77x and 1.51x, [19] achieve results which are factor 1.53 faster, whilst [3] is even as twice as fast.…”

Section: Comparison To Other Work (mentioning)

confidence: 84%


Fast software implementation of binary elliptic curve cryptography

Bluhm

Gueron

2015

J Cryptogr Eng


Abstract. This paper presents an efficient and side channel protected software implementation of point multiplication for the standard NIST and SECG binary elliptic curves. The enhanced performance is achieved by improving the López-Dahab/Montgomery method at the algorithmic level, and by leveraging Intel's AVX architecture and the pclmulqdq processor instruction at the coding level. The fast carry-less multiplication is further used to speed up the reduction on the newest Haswell platforms. For the five NIST curves over GF(2^m) with m ∈ {163, 233, 283, 409, 571}, the resulting point multiplication implementation is about 6 to 12 times faster than that of OpenSSL-1.0.1e, enhancing the ECDHE and ECDSA algorithms significantly.


“…The table lookups operate on registers only, allowing a very efficient constant-time implementation. Field multiplication is natively supported by the carry-less multiplier (PCLMULQDQ instruction), with the number of word multiplications reduced through application of Karatsuba formulae, as described in [26]. Modular reduction is implemented with a shift-and-add approach, with careful choice of aligning vector word shifts on multiples of 8, to explore the faster memory alignment instructions available in the target platform.…”

Section: Implementation Aspects (mentioning)

confidence: 99%
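The multiplication and reduction steps described in the statement above can be sketched in plain Python. This is a minimal illustration, not the cited implementation: `clmul` is a software stand-in for what PCLMULQDQ computes on 64-bit words, the function names are hypothetical, and the field GF(2^233) with the NIST reduction polynomial x^233 + x^74 + 1 is an assumption chosen for the example (the quoted text does not name the field). The byte-aligned vector shifts mentioned in the statement are not modelled here.

```python
MASK64 = (1 << 64) - 1


def clmul(a: int, b: int) -> int:
    """Schoolbook carry-less product of two polynomials over GF(2), held as ints."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2)[x] is XOR, so no carries propagate
        a <<= 1
        b >>= 1
    return result


def clmul128_karatsuba(a: int, b: int) -> int:
    """128x128-bit carry-less product using three 64x64-bit products instead of four."""
    a0, a1 = a & MASK64, (a >> 64) & MASK64
    b0, b1 = b & MASK64, (b >> 64) & MASK64
    low = clmul(a0, b0)
    high = clmul(a1, b1)
    # Karatsuba middle term: (a0 + a1)(b0 + b1) + low + high, with + meaning XOR
    middle = clmul(a0 ^ a1, b0 ^ b1) ^ low ^ high
    return low ^ (middle << 64) ^ (high << 128)


def reduce_mod_f233(c: int) -> int:
    """Reduce c modulo f(x) = x^233 + x^74 + 1 (assumed field) by shift and XOR."""
    m = 233
    while c.bit_length() > m:
        high = c >> m
        # x^233 = x^74 + 1 (mod f), so the high part folds back in at offsets 74 and 0
        c = (c & ((1 << m) - 1)) ^ high ^ (high << 74)
    return c
```

A field multiplication under these assumptions would be `reduce_mod_f233(clmul(a, b))`; a vectorised implementation would split operands into 64-bit words and apply the Karatsuba step recursively across them.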

Binary Elligator Squared

Aranha

Fouque

Qian

et al. 2014

Lecture Notes in Computer Science

Self Cite


Abstract. Applications of elliptic curve cryptography to anonymity, privacy and censorship circumvention call for methods to represent uniformly random points on elliptic curves as uniformly random bit strings, so that, for example, ECC network traffic can masquerade as random traffic. At ACM CCS 2013, Bernstein et al. proposed an efficient approach, called "Elligator," to solving this problem for arbitrary elliptic curve-based cryptographic protocols, based on the use of efficiently invertible maps to elliptic curves. Unfortunately, such invertible maps are only known to exist for certain classes of curves, excluding in particular curves of prime order and curves over binary fields. A variant of this approach, "Elligator Squared," was later proposed by Tibouchi (FC 2014) supporting not necessarily injective encodings to elliptic curves (and hence a much larger class of curves), but, although some rough efficiency estimates were provided, it was not clear how an actual implementation of that approach would perform in practice. In this paper, we show that Elligator Squared can indeed be implemented very efficiently with a suitable choice of curve encodings. More precisely, we consider the binary curve setting (which was not discussed in Tibouchi's paper), and implement the Elligator Squared bit string representation algorithm based on a suitably optimized version of the Shallue-van de Woestijne characteristic 2 encoding, which we show can be computed using only multiplications, trace and half-trace computations, and a few inversions. On the fast binary curve of Oliveira et al. (CHES 2013), our implementation runs in an average of only 22850 Haswell cycles, making uniform bit string representations possible for a very reasonable overhead, much smaller even than Elligator on Edwards curves. As a side contribution, we also compare implementations of Elligator and Elligator Squared on a curve supported by Elligator, namely Curve25519. We find that generating a random point and its uniform bitstring representation is around 35-40% faster with Elligator for protocols using a fixed base point (such as static ECDH), but 30-35% faster with Elligator Squared in the case of a variable base point (such as ElGamal encryption). Both are significantly slower than our binary curve implementation.


“…In [4], a high dynamic range RNS bases for MM has been proposed. Elliptic curves represent a very elegant and efficient way to encrypt/decrypt information, wherein MM is also the key operation [5, 6]. There have been various proposals for hardware architectures for MM [2, 7-10] and for ME [2, 7, 11], exploring, in both operations, parallel and systolic features.…”

Section: Related Work (mentioning)

confidence: 99%

Massively parallel modular exponentiation method and its implementation in software and hardware for high-performance cryptographic systems

Nedjah

Mourelle

Santana

et al. 2012

IET Comput. Digit. Tech.


Most cryptographic systems are based on modular exponentiation (ME), which is performed using successive modular multiplications (MMs). There are three ways to improve the throughput of such an implementation: reducing the number of required MMs, reducing the time spent performing a single MM, and executing the required independent modular multiplications (IMMs) in parallel. With the purpose of further accelerating the computation of ME, we investigate the impact of these three strategies. First, we propose a massively parallel scheme aiming at performing all IMMs concurrently. The scheme is based on the m-ary exponentiation method, which groups the exponent bits into partitions so that the number of required MMs is reduced, provided that some common modular powers are pre-computed and stored for repeated use. Finally, two different implementations of the MM are used: one sequential and the other systolic. The investigation culminates in a comparison of the speedups obtained against the extra costs incurred for seven different implementations, one software based and the other six hardware based.
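The m-ary method mentioned in the abstract can be illustrated with a short sequential sketch. It shows only the windowing idea (precompute small powers, then process the exponent in fixed-size groups of bits); it does not model the parallel or systolic schemes the paper proposes. The function name `mary_modexp` and the `window_bits` parameter are hypothetical names chosen for this example.

```python
def mary_modexp(base: int, exponent: int, modulus: int, window_bits: int = 4) -> int:
    """Left-to-right m-ary modular exponentiation with m = 2**window_bits."""
    if modulus == 1:
        return 0
    m = 1 << window_bits

    # Precompute base^0 .. base^(m-1) mod modulus once, for repeated use.
    table = [1] * m
    for i in range(1, m):
        table[i] = (table[i - 1] * base) % modulus

    # Split the exponent into window_bits-sized digits, least significant first.
    digits = []
    e = exponent
    while e:
        digits.append(e & (m - 1))
        e >>= window_bits

    result = 1
    for digit in reversed(digits):
        # Raise the running result to the m-th power: window_bits squarings.
        for _ in range(window_bits):
            result = (result * result) % modulus
        # One table multiplication per non-zero digit replaces up to
        # window_bits separate multiplications in the binary method.
        if digit:
            result = (result * table[digit]) % modulus
    return result
```

The result can be checked against Python's built-in three-argument `pow(base, exponent, modulus)`. Larger windows trade a bigger precomputed table for fewer multiplications in the main loop.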
