2016 IEEE National Aerospace and Electronics Conference (NAECON) and Ohio Innovation Summit (OIS) 2016
DOI: 10.1109/naecon.2016.7856841

QR decomposition using FPGAs

Cited by 14 publications (4 citation statements)
References 3 publications
“…A new loop structure of MGS algorithm was proposed by Langhammer and Pasca [27] and implemented on Intel Arria 10, the sustained‐to‐peak performance achieved was approximately 100%. It is expected that smaller size QRD would have higher speed; however, in Langhammer and Pasca, [27] bigger size QRD of 32 × 32 with 62.6 GFlops showed higher throughput than QRD in [37] which is 32 × 8 in size with 13.9 GFlops. This means algorithm structure affects synthesized hardware's performance.…”
Section: Previous Work (mentioning)
confidence: 99%
“…A matrix A can be decomposed into the product of a Q orthogonal matrix and an R upper triangular matrix, where: A = QR, Fig. 2 illustrates the part of Q matrix computation from GS algorithm [10].…”
Section: Gram Schmidt Algorithm (mentioning)
confidence: 99%
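The factorization A = QR quoted above can be illustrated in software. The following is a minimal sketch of the modified Gram-Schmidt (MGS) variant in plain Python, assuming a real matrix with linearly independent columns stored as a list of rows. The function name `mgs_qr` and the example matrix are illustrative assumptions; this is not the FPGA design from any of the cited papers.

```python
import math


def mgs_qr(a):
    """Modified Gram-Schmidt QR of an m x n matrix given as a list of rows.

    Returns (q, r) with q of size m x n (orthonormal columns) and r of size
    n x n (upper triangular). Assumes linearly independent columns.
    """
    m, n = len(a), len(a[0])
    # v[j] holds the j-th column of A; it is updated in place as
    # components along earlier q vectors are removed.
    v = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Normalize the current residual column to obtain q_j.
        r[j][j] = math.sqrt(sum(x * x for x in v[j]))
        if r[j][j] == 0.0:
            # Exact-zero test only; a real implementation would use a tolerance.
            raise ValueError("columns are linearly dependent")
        qj = [x / r[j][j] for x in v[j]]
        q_cols.append(qj)
        # MGS step: immediately remove the q_j component from every
        # remaining column, using the already-updated columns.
        for k in range(j + 1, n):
            r[j][k] = sum(qi * vi for qi, vi in zip(qj, v[k]))
            v[k] = [vi - r[j][k] * qi for vi, qi in zip(v[k], qj)]
    # Convert the list of columns back to a list of rows.
    q = [[q_cols[j][i] for j in range(n)] for i in range(m)]
    return q, r


# Hypothetical 3 x 3 example matrix (nonsingular).
A = [[1.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
Q, R = mgs_qr(A)
```

Multiplying `Q` by `R` should reproduce `A` up to rounding error, and the columns of `Q` should be orthonormal. MGS differs from classical Gram-Schmidt only in that each projection is subtracted from the already-updated columns, which tends to lose less orthogonality in floating point.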
“…LU decomposition is also likely unsuitable for small matrices, and most works restrict their solution to nonsingular matrices to avoid costly pivoting. In this thesis, we adapted the work from Parker, Mauer and Pritsker (2016) to include a heterogeneous solution with OpenCL working in double precision. We published those results in "Exploration of FPGA-Based Hardware Designs for QR Decomposition for Solving Stiff ODE Numerical Methods Using the HARP Hybrid Architecture" (JUNIOR et al, 2020).…”
Section: Related Work (mentioning)
confidence: 99%
“…Iterative Direct Double Memory type Codesign
Jacobi (Souza, 2017) x x Local/Global x
QR (Souza, 2017) x x Local/global x
LU (Kapre 2009) x x Local
LU (Daga, 2004) x x Local
LU (Zhuo, 2006) x x Local
LU (Wu, 2011) x Local/Global x
Jacobi (Ruan, 2013) x Local/Global x
QR (Parker, 2016) x Local
QR (Langhammer, 2018) x
LU (Ge, 2017) x Local/Global x
Cholesky (Liu, 2017) x x Local/Global
Gauss-Jordan (Jiang, 2017) x
Gauss-Jordan (Meng, 2022) x Local/Global
Truncated Spike (Macintosh, 2019) x Local/Global…”
Section: Related Work (mentioning)
confidence: 99%