2008
DOI: 10.1109/tc.2008.55

High-Performance Designs for Linear Algebra Operations on Reconfigurable Hardware

Abstract: Numerical linear algebra operations are key primitives in scientific computing. Performance optimizations of such operations have been extensively investigated. With the rapid advances in technology, hardware acceleration of linear algebra applications using field-programmable gate arrays (FPGAs) has become feasible. In this paper, we propose FPGA-based designs for several basic linear algebra operations, including dot product, matrix-vector multiplication, matrix multiplication, and matrix factorizat…

Citations: cited by 81 publications (46 citation statements)
References: 26 publications
“…[21] proposes a new architecture of bicubic interpolation and implemented it on FPGA. [22] proposes a linear algebra implementations on FPGA, the authors utilizes the overlap technique between I/O and execution time to increase computing speed. [23] shows the procedure of mapping a Jacobi Iterative Solver on FPGA.…”
Section: Related Work (mentioning)
confidence: 99%
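The overlap of I/O with execution mentioned in this statement can be illustrated with a simple timing model. The sketch below is not taken from the cited paper: it assumes a workload split into equal blocks, a fixed transfer time and a fixed compute time per block, and a two-stage pipeline in which the transfer of the next block runs while the current block is being computed. The function names and parameters are hypothetical.

```python
def total_time_serial(num_blocks, io_time, compute_time):
    """Each block is transferred, then computed, with no overlap."""
    return num_blocks * (io_time + compute_time)


def total_time_overlapped(num_blocks, io_time, compute_time):
    """Two-stage pipeline: the transfer of block i+1 overlaps the compute of block i.

    Assumption (for illustration only): constant per-block times and
    enough buffering to hold one block in flight while another is computed.
    """
    if num_blocks == 0:
        return 0
    # The first transfer and the last compute cannot be hidden; every
    # step in between is bounded by the slower of the two stages.
    return io_time + (num_blocks - 1) * max(io_time, compute_time) + compute_time


if __name__ == "__main__":
    blocks, io_t, comp_t = 4, 2, 3
    print("serial:    ", total_time_serial(blocks, io_t, comp_t))
    print("overlapped:", total_time_overlapped(blocks, io_t, comp_t))
```

Under this model the benefit is largest when transfer and compute times are similar, since the faster stage is then almost entirely hidden behind the slower one.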
“…Although the PE connection pattern in the form of a tree is also possible [13], the linear list has the advantage of a much more regular structure, which allows simpler routing between PEs and consequently the higher clock frequency. After the initial latency, a list of n PEs multiply two n-element vectors in one clock cycle, or two square matrices of order n in n …”
Section: Accelerator Architecture (mentioning)
confidence: 99%
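The throughput claim in this statement (one vector product per clock cycle after an initial latency) can be checked with a small cycle-level simulation. This is a minimal sketch under stated assumptions, not the architecture of the cited work: it assumes each PE k multiplies element k of the two vectors and adds the product to a partial sum received from its left neighbour, with one register between neighbouring PEs. The function name and data layout are hypothetical.

```python
def linear_array_dot_products(pairs):
    """Simulate a linear array of n PEs computing a stream of dot products.

    pairs: list of (a, b) tuples, each vector of length n.
    Returns (results, cycles), where cycles is the total clock count.
    Assumption: PE k holds a register with (pair index, partial sum).
    """
    n = len(pairs[0][0])
    total = len(pairs)
    regs = [None] * n          # output register of each PE
    results = []
    cycle = 0
    while len(results) < total:
        new_regs = [None] * n
        for k in range(n):
            if k == 0:
                # A new vector pair enters the first PE every cycle.
                incoming = (cycle, 0) if cycle < total else None
            else:
                # Other PEs read what their left neighbour latched last cycle.
                incoming = regs[k - 1]
            if incoming is not None:
                i, partial = incoming
                a, b = pairs[i]
                new_regs[k] = (i, partial + a[k] * b[k])
        regs = new_regs
        cycle += 1
        if regs[n - 1] is not None:
            results.append(regs[n - 1][1])   # last PE emits a finished product
    return results, cycle


if __name__ == "__main__":
    stream = [([1, 2, 3], [4, 5, 6]), ([1, 0, 1], [2, 2, 2])]
    print(linear_array_dot_products(stream))
```

In this model, m vector pairs of length n complete in n + m - 1 cycles: n cycles of latency for the first result, then one further result per cycle.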
“…The main focus of that work was to examine the potential capacity of FPGAs in performing BLAS operations. The only work that has implemented linear algebra applications on the reconfigurable computing systems is [21]. However, it only employs the FPGAs in the systems.…”
Section: Linear Algebra on FPGAs (mentioning)
confidence: 99%