Proceedings of the 2005 ACM/SIGDA 13th International Symposium on Field-Programmable Gate Arrays 2005
DOI: 10.1145/1046192.1046202

Sparse Matrix-Vector multiplication on FPGAs

Cited by 180 publications (125 citation statements)
References 10 publications

“…Zhuo, et al, employ a tree-based multiply accumulate unit and a reduction unit to perform multiple operations in parallel [10]. However, the structure of the reduction unit depends on the sparsity pattern and there are a large number of zero paddings to meet the alignment requirement of the adder tree.…”
Section: Related Work (mentioning)
confidence: 99%
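
The zero-padding cost described in this statement can be illustrated in software. The sketch below is not the hardware design from the cited paper; it assumes a hypothetical fixed-width binary adder tree (`width` leaves, a power of two) and a hypothetical helper `tree_dot`, and it handles only rows that fit in a single pass through the tree. A row with fewer nonzeros than leaves is padded with zeros so that every adder has two inputs.

```python
def tree_dot(vals, xs, width=4):
    """Dot product of one sparse row via a fixed-width binary adder tree.

    vals  : nonzero values of the row
    xs    : matching entries of the dense vector
    width : number of tree leaves (assumed to be a power of two)
    Returns (sum, number_of_padded_zeros).
    """
    if width < 1 or width & (width - 1):
        raise ValueError("width must be a power of two")
    if len(vals) != len(xs) or len(vals) > width:
        raise ValueError("row must fit in a single pass through the tree")

    # Multiplier stage: one product per nonzero.
    products = [v * x for v, x in zip(vals, xs)]

    # Alignment: unused leaves are filled with zeros.
    padding = width - len(products)
    level = products + [0.0] * padding

    # Adder stages: each level halves the number of partial sums.
    while len(level) > 1:
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0], padding


total, padded = tree_dot([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], width=4)
print(total, padded)
```

With three nonzeros and four leaves, one leaf is wasted on a zero. For matrices whose rows are much shorter than the tree width, most leaves would carry padding, which is the inefficiency the citing authors point out.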
“…In this paper, the sparse matrix is stored in a Compressed Sparse Row (CRS) format, in which only the nonzero matrix elements will be stored in contiguous memory locations. In CRS format, there are three vectors: val for nonzero matrix elements; col for the column index of the nonzero matrix elements; and ptr stores the locations in the val vector that start a new row (Zhuo and Prasanna, 2005). As an example, consider a simple SMVM operation with 5×5 sparse matrix A as follows: …”
Section: Sparse Matrix-vector Multiplication (mentioning)
confidence: 99%
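
The three-vector layout described in this statement can be sketched as follows. The 5×5 example matrix from the citing paper is not reproduced on this page, so the 3×3 matrix below is a hypothetical stand-in. The sketch also assumes 0-based indices and the common convention that `ptr` carries one extra trailing entry equal to the number of nonzeros, which the statement itself does not specify. The function name `csr_matvec` is illustrative.

```python
def csr_matvec(val, col, ptr, x):
    """Compute y = A * x for a matrix A stored in CRS form.

    val : nonzero values, stored row by row
    col : column index of each entry in val
    ptr : ptr[i] is the position in val where row i starts;
          ptr[-1] == len(val) (assumed convention)
    x   : dense input vector
    """
    n_rows = len(ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Only the nonzeros of row i are visited.
        for k in range(ptr[i], ptr[i + 1]):
            acc += val[k] * x[col[k]]
        y[i] = acc
    return y


# Hypothetical matrix (not the paper's example):
#     [1 0 2]
# A = [0 0 3]
#     [4 5 0]
val = [1.0, 2.0, 3.0, 4.0, 5.0]
col = [0, 2, 2, 0, 1]
ptr = [0, 2, 3, 5]

print(csr_matvec(val, col, ptr, [1.0, 2.0, 3.0]))  # [7.0, 9.0, 14.0]
```

The access `x[col[k]]` is an indirect, data-dependent read of the dense vector, and the inner loop length varies from row to row.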
“…This algorithm is almost dominated by SMVM operations where the target matrix is extremely sparse, unsymmetrical and unstructured. This problem has also been investigated for acceleration with a FPGA solution in (McGettrick et al, 2008; Zhuo and Prasanna, 2005).…”
Section: Introduction (mentioning)
confidence: 99%
“…In [8], a block matrix multiplication algorithm is discussed for large n, and a floating-point MAC (Multiplier and ACcumulator) is implemented. In [6,22], FPGA-based designs for floating-point sparse matrix-vector multiplication are proposed and achieve high speedup over general-purpose processors. In [18], FPGA-based implementations of BLAS (Basic Linear Algebra Subprograms) operations are discussed.…”
Section: Linear Algebra on FPGAs (mentioning)
confidence: 99%