2010 IEEE 8th Symposium on Application Specific Processors (SASP)
DOI: 10.1109/sasp.2010.5521144
FPGA and GPU implementation of large scale SpMV

Abstract: Sparse matrix-vector multiplication (SpMV) is a fundamental operation for many applications. Many studies have implemented SpMV on different platforms, but few have focused on very large scale datasets with millions of dimensions. This paper addresses the challenges of implementing large scale SpMV on FPGA and GPU in the application of web link graph analysis. In the FPGA implementation, we designed the task partition and memory hierarchy according to the analysis of the dataset scale and t…
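As background for the abstract above, here is a minimal sketch of the SpMV operation y = A·x over the CSR (compressed sparse row) layout, the format most commonly assumed in this literature. All names are illustrative, not taken from the paper.

```python
# Minimal SpMV sketch over CSR: row_ptr gives each row's offset into the
# flat col_idx/vals arrays; y[i] accumulates the row-i dot product.
def spmv_csr(row_ptr, col_idx, vals, x):
    """Multiply a CSR-encoded sparse matrix by a dense vector x."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Row i's nonzeros occupy indices row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y

# A = [[1, 0, 2],
#      [0, 3, 0]]
row_ptr = [0, 2, 3]
col_idx = [0, 2, 1]
vals    = [1.0, 2.0, 3.0]
print(spmv_csr(row_ptr, col_idx, vals, [1.0, 1.0, 1.0]))  # [3.0, 3.0]
```

The irregular, x-dependent memory accesses in the inner loop are what make large scale SpMV hard to accelerate on both FPGAs and GPUs.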

Cited by 28 publications (11 citation statements)
References 11 publications (13 reference statements)
“…If this is not the case, zero padding is typically used to adapt the row size. Other approaches are based on statically assigning partial dot-products to multiple processing engines [23], [24]. A control unit is used to manage the communication and ensure proper execution.…”
Section: B. SpMV on FPGA (mentioning)
confidence: 99%
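The zero-padding approach referenced in the statement above can be sketched as follows: rows are padded to a uniform nonzero count (as in the ELL format) so that fixed-size processing engines receive equally sized work items. The function and value names are illustrative assumptions.

```python
# Hedged sketch of zero padding for SpMV hardware pipelines: every row is
# padded to the length of the longest row, so each processing engine can
# consume a fixed number of (column, value) pairs per row.
def pad_rows(rows, pad_col=0, pad_val=0.0):
    """rows: list of [(col, val), ...] per row. Pads each row to max length."""
    width = max(len(r) for r in rows)
    return [r + [(pad_col, pad_val)] * (width - len(r)) for r in rows]

rows = [[(0, 1.0), (2, 2.0)],   # row 0: two nonzeros
        [(1, 3.0)]]             # row 1: one nonzero, gets one pad entry
padded = pad_rows(rows)
# Padded entries contribute pad_val * x[pad_col] = 0 to each dot product,
# so results are unchanged while the work per row becomes uniform.
```

The trade-off, as the surveyed designs note, is wasted bandwidth and compute on the padding when row lengths are highly skewed.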
“…A substantial body of literature has explored the optimization of sparse formats and algorithms for CPUs [6,7,1] and GPGPUs [8,9,10,11,12]. In general, these optimizations aim to minimize the irregularity of the matrix structure by selecting a format best suited for the matrix kernel.…”
Section: A. Conventional Sparse Data Formats (mentioning)
confidence: 99%
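The format-selection step described above usually starts from raw (row, col, val) triplets and converts them into a structured layout. A hedged sketch of one such conversion, COO to CSR, follows; the function name and argument order are assumptions for illustration.

```python
# Illustrative COO-to-CSR conversion: count nonzeros per row, then turn the
# counts into row offsets with a prefix sum, as many of the surveyed
# CPU/GPU/FPGA designs do before running their SpMV kernels.
def coo_to_csr(n_rows, triplets):
    """triplets: iterable of (row, col, val) tuples."""
    row_ptr = [0] * (n_rows + 1)
    col_idx, vals = [], []
    for r, c, v in sorted(triplets):   # group entries by row
        row_ptr[r + 1] += 1            # count nonzeros in row r
        col_idx.append(c)
        vals.append(v)
    for i in range(n_rows):            # prefix-sum counts into offsets
        row_ptr[i + 1] += row_ptr[i]
    return row_ptr, col_idx, vals

# Same 2x3 matrix as before, given as unordered triplets:
print(coo_to_csr(2, [(0, 0, 1.0), (1, 1, 3.0), (0, 2, 2.0)]))
# ([0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0])
```

Which target format wins (CSR, ELL, COO, hybrids) depends on the nonzero distribution of the matrix, which is exactly the irregularity these works try to minimize.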
“…Depending on the implementation, the meta-data for CSR is either pre-loaded into the bitstream or dynamically accessed from external memory. While earlier designs were restricted to on-die memory capacities (e.g., [18]), more recent designs incorporate memory hierarchies that can handle large data sets exceeding the available on-chip memories [24,25,26,11,10,27,9,28,29,30,14,23].…”
Section: Related Work (mentioning)
confidence: 99%
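One common way such memory hierarchies cope with vectors larger than on-chip memory is tiling: the dense vector x is split into chunks small enough to hold on-chip, and each pass only consumes nonzeros whose columns fall in the resident chunk. The sketch below is a software analogue of that idea; the chunk size and all names are illustrative assumptions, not the cited designs' actual parameters.

```python
# Software analogue of vector tiling for large scale SpMV: process the CSR
# matrix in column-range passes so that only a CHUNK-sized slice of x needs
# to be "on chip" at a time. CHUNK = 2 is deliberately tiny for illustration.
CHUNK = 2

def spmv_tiled(row_ptr, col_idx, vals, x):
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for lo in range(0, len(x), CHUNK):          # "load" one tile of x
        hi = lo + CHUNK
        for i in range(n_rows):
            for k in range(row_ptr[i], row_ptr[i + 1]):
                if lo <= col_idx[k] < hi:       # only columns in this tile
                    y[i] += vals[k] * x[col_idx[k]]
    return y
```

Real FPGA designs avoid the repeated sweep over all nonzeros by pre-sorting or partitioning the matrix per tile, but the accumulation pattern is the same.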
“…Similar activities exist on CPUs in a more generic setting [11]. Despite this, a substantial amount of work on sparse linear algebra is focused on saturating the available memory bandwidth by increasing the internal parallelism of an SpMV kernel, in either a general-purpose [4], [3], [9] or an application-specific [1], [2], [6] setting.…”
Section: Introduction (mentioning)
confidence: 99%