HEPCloud: An FPGA-based Multicore Processor for FV Somewhat Homomorphic Function Evaluation

Roy, Sujoy Sinha; Järvinen, Kimmo; Vliegen, Jo; Vercauteren, Fréderik; Verbauwhede, Ingrid

doi:10.1109/tc.2018.2816640

Cited by 45 publications (17 citation statements)

References 30 publications (34 reference statements)


“…We remark that our best GPU results, namely the homomorphic multiplication runtime of 51 ms for n = 2^16 and log_2 q = 1,770 and 18.7 ms for n = 2^16 and log_2 q = 1,020, are more than two orders of magnitude faster than best previously reported runtimes for other implementations of the BFV scheme. For instance, the FPGA-based implementation HEPCloud in [18] of the textbook BFV scheme computed a homomorphic multiplication for n = 2^15 and log_2 q = 1,228 in 26.67 seconds (with 3.36 seconds spent on the computation and the rest on the off-chip memory access). The BEHZ variant NFLlib CPU implementation in [9] ran a homomorphic multiplication for n = 2^15 and log_2 q = 1,590 in 4.9 seconds.…”

Section: Benchmarking (mentioning)

confidence: 99%
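The runtimes quoted in the benchmarking statement above can be compared with a few lines of arithmetic. This is only a rough sketch: the variable names are made up for illustration, and the quoted figures come from different parameter sets (n and log_2 q differ between implementations), so the ratios are indicative rather than like-for-like.

```python
# Runtimes in seconds, copied from the quoted citation statement.
# Variable names are illustrative assumptions, not from any of the papers.
gpu_mult_large = 0.051      # GPU, n = 2^16, log_2 q = 1,770
gpu_mult_small = 0.0187     # GPU, n = 2^16, log_2 q = 1,020
hepcloud_total = 26.67      # HEPCloud FPGA, n = 2^15, log_2 q = 1,228
hepcloud_compute = 3.36     # portion of the HEPCloud total spent computing
nfllib_cpu = 4.9            # NFLlib CPU (BEHZ), n = 2^15, log_2 q = 1,590


def speedup(baseline_s: float, new_s: float) -> float:
    """Return how many times faster new_s is than baseline_s."""
    return baseline_s / new_s


# Time HEPCloud spends on off-chip memory access, and its share of the total.
hepcloud_memory = hepcloud_total - hepcloud_compute
memory_share = hepcloud_memory / hepcloud_total

print(f"HEPCloud off-chip memory time: {hepcloud_memory:.2f} s "
      f"({memory_share:.1%} of total)")
print(f"GPU (51 ms) vs HEPCloud total:   {speedup(hepcloud_total, gpu_mult_large):.0f}x")
print(f"GPU (51 ms) vs HEPCloud compute: {speedup(hepcloud_compute, gpu_mult_large):.0f}x")
print(f"GPU (51 ms) vs NFLlib CPU:       {speedup(nfllib_cpu, gpu_mult_large):.0f}x")
print(f"GPU (18.7 ms) vs NFLlib CPU:     {speedup(nfllib_cpu, gpu_mult_small):.0f}x")
```

Separating the HEPCloud total from its compute-only portion shows how much of the reported gap comes from off-chip memory access, which is the point the later citation statements about off-chip memory also make.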


Implementation and Performance Evaluation of RNS Variants of the BFV Homomorphic Encryption Scheme

Badawi, Polyakov, Aung, et al. 2021. IEEE Trans. Emerg. Topics Comput.


Homomorphic encryption is an emerging form of encryption that provides the ability to compute on encrypted data without ever decrypting them. Potential applications include aggregating sensitive encrypted data on a cloud environment and computing on the data in the cloud without compromising data privacy. There have been several recent advances resulting in new homomorphic encryption schemes and optimized variants. We implement and evaluate the performance of two optimized variants, namely Bajard-Eynard-Hasan-Zucca (BEHZ) and Halevi-Polyakov-Shoup (HPS), of the most promising homomorphic encryption scheme in CPU and GPU. The most interesting (and also unexpected) result of our performance evaluation is that the HPS variant in practice scales significantly better (typically by 15%-30%) with increase in multiplicative depth of the computation circuit than BEHZ, implying that the HPS variant will always outperform BEHZ for most practical applications. For the multiplicative depth of 98, our fastest GPU implementation performs homomorphic multiplication in 51 ms for 128-bit security settings, which is faster by two orders of magnitude than prior results and already practical for cloud environments supporting GPU computations. Large multiplicative depths supported by our implementations are required for applications involving deep neural networks, logistic regression learning, and other important machine learning problems.


“…For instance, Al Badawi et al. [17] provide a GPU-accelerated implementation of BEHZ. Another recent effort dealt with accelerating the textbook BFV performance using FPGA [18].…”

Section: Introduction (mentioning)

confidence: 99%


“…As has been shown by prior art [53,54], leveraging off-chip memory to store intermediate results significantly reduces the overall performance due to high delays between subsequent reads and writes. One of our primary design goals is to avoid off-chip memory access as much as possible.…”

Section: On-chip Vs Off-chip Memory Accesses (mentioning)

confidence: 99%

“…Hardware Acceleration for non-CKKS Schemes. In [53], a system based on FPGA is proposed for BFV scheme to process ciphertext polynomial sizes of 2^15. However, due to the massive off-chip data transfer, their design does not yield superior performance compared to CPU execution.…”

Section: Related Work (mentioning)

confidence: 99%

“…Prior work that propose customized hardware for non-CKKS schemes have taken one of these approaches: (i) Designing co-processors that only accelerate certain low-level ring operations [14,19,20,30,39,61]; high-level operations are performed on the CPU-side, which makes the coprocessors of limited practical use. (ii) Storing intermediate results on off-chip memory, which significantly degrades the performance [51] to the extent that it can be worse than naive software execution [53]. (iii) Designing a hardware for a fixed modest-sized parameter, e.g., n = 2^12 [54].…”

Section: Introduction (mentioning)

confidence: 99%


Heax

Riazi, Laine, Pelton, et al. 2020. Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Syste


With the rapid increase in cloud computing, concerns surrounding data privacy, security, and confidentiality also have been increased significantly. Not only cloud providers are susceptible to internal and external hacks, but also in some scenarios, data owners cannot outsource the computation due to privacy laws such as GDPR, HIPAA, or CCPA. Fully Homomorphic Encryption (FHE) is a groundbreaking invention in cryptography that, unlike traditional cryptosystems, enables computation on encrypted data without ever decrypting it. However, the most critical obstacle in deploying FHE at large-scale is the enormous computation overhead. In this paper, we present HEAX, a novel hardware architecture for FHE that achieves unprecedented performance improvements. HEAX leverages multiple levels of parallelism, ranging from ciphertext-level to fine-grained modular arithmetic level. Our first contribution is a new highly parallelizable architecture for number-theoretic transform (NTT) which can be of independent interest as NTT is frequently used in many lattice-based cryptography systems. Building on top of NTT engine, we design a novel architecture for computation on homomorphically encrypted data. Our implementation on reconfigurable hardware demonstrates 164-268× performance improvement for a wide range of FHE parameters.


Efficient number theoretic transform implementation on GPU for homomorphic encryption

et al. 2021

