2011
DOI: 10.1016/j.parco.2011.08.004
Two-dimensional cache-oblivious sparse matrix–vector multiplication

Abstract: In earlier work, we presented a one-dimensional cache-oblivious sparse matrix–vector (SpMV) multiplication scheme which has its roots in one-dimensional sparse matrix partitioning. Partitioning is often used in distributed-memory parallel computing for the SpMV multiplication, an important kernel in many applications. A logical extension is to move towards using a two-dimensional partitioning. In this paper, we present our research in this direction, extending the one-dimensional method for cach…
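For context, the kernel in question computes y = Ax for a sparse matrix A. Below is a minimal serial CSR-based SpMV sketch in C; it is the textbook baseline, not the cache-oblivious or two-dimensional scheme the paper itself proposes, and the array names are illustrative.

```c
#include <stddef.h>

/* Baseline sparse matrix-vector multiplication, y = A*x, with A stored in
 * Compressed Sparse Row (CSR) form:
 *   row_ptr[i]..row_ptr[i+1]-1 index the nonzeros of row i,
 *   col_idx[k] is the column of the k-th nonzero, val[k] its value.
 * The irregular, data-dependent accesses x[col_idx[k]] are what partitioning
 * and reordering schemes try to make cache-friendly. */
void spmv_csr(size_t n_rows,
              const size_t *row_ptr, const size_t *col_idx,
              const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}
```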

Cited by 31 publications (23 citation statements). References 17 publications.
“…Applying partitioning to minimise communication between computing cores is not enough, as data access patterns of the input vector are not improved while bandwidth becomes more limited as more cores are involved in the computation. Future work should be directed towards combining communication minimisation with methods to enhance cache use, for example by permuting of the local input matrix representations, by adapting the sparse matrix storage scheme or both.…”
Section: Discussion
confidence: 99%
“…Because these factors are not constant, the BSP model does not make very accurate predictions on the run-time […] more limited as more cores are involved in the computation. Future work should be directed towards combining communication minimisation with methods to enhance cache use, for example by permuting of the local input matrix representations [18,21], by adapting the sparse matrix storage scheme [27–29] or both.…”
confidence: 99%
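To make the quoted suggestion of permuting the local input matrix representations concrete, the sketch below applies a row and column permutation to a matrix held in coordinate (COO) form before it would be converted to the final storage scheme; permutations that make nearby rows touch nearby entries of x can improve reuse of the input vector in cache. The type and function names are illustrative assumptions, not taken from the cited works.

```c
#include <stddef.h>

/* One nonzero of a sparse matrix in coordinate (COO) form. */
typedef struct { size_t row, col; double val; } nz_t;

/* Relabel rows and columns in place: row i becomes row_perm[i] and
 * column j becomes col_perm[j]. After re-sorting (not shown), the permuted
 * triplets can be converted to CSR or any other storage scheme.
 * row_perm and col_perm must be valid permutations of 0..n-1. */
void permute_coo(nz_t *nz, size_t nnz,
                 const size_t *row_perm, const size_t *col_perm)
{
    for (size_t k = 0; k < nnz; ++k) {
        nz[k].row = row_perm[nz[k].row];
        nz[k].col = col_perm[nz[k].col];
    }
}
```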
“…1 for a visual example of it. However, unlike CSB (Buluç et al. [8]), the sparse blocks' dimensions are not uniform, and unlike Yzelman and Bisseling's ([9]), our techniques are not hyper-graph based. Similarly to other approaches, selection of a data structure for blocks occurs, but without using completely novel formats, as Kourtis et al. [10] do with CSX or as Belgin et al. [11] do with PBR.…”
Section: Introduction and Related Literature
confidence: 94%
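As a rough illustration of the block-based idea discussed in that statement (and only that: the snippet is not CSB, not the hypergraph-based scheme of Yzelman and Bisseling, and not the adaptive CSX or PBR formats), the sketch below bins COO nonzeros into a uniform grid of 2D blocks; a real implementation would then choose a storage format per block and could use non-uniform block sizes. All names here are illustrative assumptions.

```c
#include <stddef.h>

typedef struct { size_t row, col; double val; } nz_t;

/* Count how many nonzeros fall into each block of a uniform blocks_r x blocks_c
 * grid laid over an n_rows x n_cols matrix. A block-based SpMV data structure
 * would use such counts to allocate per-block storage and to pick a format
 * (dense, COO, CSR, ...) for each block. */
void count_block_nnz(const nz_t *nz, size_t nnz,
                     size_t n_rows, size_t n_cols,
                     size_t blocks_r, size_t blocks_c,
                     size_t *block_nnz /* blocks_r * blocks_c, zero-initialised */)
{
    const size_t br = (n_rows + blocks_r - 1) / blocks_r;  /* rows per block */
    const size_t bc = (n_cols + blocks_c - 1) / blocks_c;  /* cols per block */
    for (size_t k = 0; k < nnz; ++k) {
        size_t bi = nz[k].row / br;
        size_t bj = nz[k].col / bc;
        block_nnz[bi * blocks_c + bj] += 1;
    }
}
```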
“…In recent years, the compressed sparse row (CSR) technology is very popular in finite element analysis. It can significantly reduce the memory requirements through only storing the non-zeros of the stiffness matrix.…”
Section: Multilayer and Multigrain Parallel Computing Approach
confidence: 99%
“…In recent years, the compressed sparse row (CSR) technology is very popular in finite element analysis. [31] It can significantly reduce the memory requirements through only storing the non-zeros of the stiffness matrix. If we can use the CSR format to store the structure stiffness matrix instead of the Skyline format, then the memory requirements will be considerably reduced.…”
Section: Solution to Limited Storage of MIC Card
confidence: 99%
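The memory argument in these quoted passages can be made concrete with a quick count. Assuming 8-byte values and 4-byte indices (an assumption for illustration, not figures from the cited work), a skyline store keeps every entry inside each column's profile, explicit zeros included, while CSR keeps only the nonzeros plus index arrays:

```c
#include <stddef.h>

/* Rough storage estimates, in bytes, for an n x n stiffness matrix
 * (illustrative only; assumes 8-byte values and 4-byte integer indices). */

/* Skyline: all entries between the first stored entry of each column and the
 * diagonal are kept, zeros included; profile_height[j] is that count for
 * column j. One start offset per column is kept for addressing. */
size_t skyline_bytes(size_t n, const size_t *profile_height)
{
    size_t entries = 0;
    for (size_t j = 0; j < n; ++j)
        entries += profile_height[j];
    return entries * sizeof(double) + n * sizeof(int);
}

/* CSR: only the nnz nonzeros are stored, plus one column index per nonzero
 * and n+1 row pointers. */
size_t csr_bytes(size_t n, size_t nnz)
{
    return nnz * (sizeof(double) + sizeof(int)) + (n + 1) * sizeof(int);
}
```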