1994
DOI: 10.1007/3-540-58184-7_111

Run-time optimization of sparse matrix-vector multiplication on SIMD machines

Abstract: Sparse matrix-vector multiplication forms the heart of iterative linear solvers used widely in scientific computations (e.g., finite element methods). In such solvers, the matrix-vector product is computed repeatedly, often thousands of times, with updated values of the vector until convergence is achieved. In an SIMD architecture, each processor has to fetch the updated off-processor vector elements while computing its share of the product. In this paper, we report on run-time optimization of array …
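For readers unfamiliar with the kernel the abstract refers to, the following is a minimal serial sketch of a sparse matrix-vector product. It assumes the compressed sparse row (CSR) layout, which the abstract does not specify, and the function name `csr_matvec` and the example matrix are illustrative only. It does not model the SIMD distribution or the off-processor fetches that the paper optimizes.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed layout)."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        # Nonzeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y


# Illustrative 3x3 matrix: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

In an iterative solver this product would be recomputed with each updated vector `x`, which is why the cost of gathering the `x[col_idx[k]]` entries dominates on a distributed machine.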

Cited by 15 publications
(15 citation statements)
References 15 publications
“…All programs are written in C + MPI (Message Passing Interface) [21] codes. The sparse ratio is set to 0.1 for all test three-dimensional sparse arrays used as test samples.…”
Section: Results (mentioning)
confidence: 99%
“…Ziantz et al [21] proposed a run-time technique that was applied to sparse arrays for array distributions and off-processor data fetching to reduce the communication and computation time. They used the block data distribution scheme with a bin-packing algorithm to distribute a global sparse array to processors.…”
Section: Related Work (mentioning)
confidence: 99%
“…Ziantz et al [32] proposed a run-time optimization technique that was applied to sparse arrays for array distributions and off-processor data fetching to reduce the communication and the computation time. In their technique, they used the Block partition method with a binpacking algorithm to distribute a global sparse array to processors.…”
Section: Related Work (mentioning)
confidence: 99%
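The two statements above mention distributing a sparse array to processors with a bin-packing algorithm. As a rough illustration of that idea only, the sketch below assigns rows to processors with a greedy largest-first heuristic so that nonzero counts are roughly balanced. The heuristic, the function name `pack_rows`, and the sample counts are assumptions made for illustration; the cited work may use a different packing rule and a block-contiguous layout.

```python
import heapq


def pack_rows(nnz_per_row, num_procs):
    """Greedy largest-first packing of rows onto processors (assumed heuristic)."""
    heap = [(0, p) for p in range(num_procs)]  # (current load, processor id)
    assignment = [[] for _ in range(num_procs)]
    # Place the heaviest rows first, each on the currently lightest processor.
    for row in sorted(range(len(nnz_per_row)), key=lambda r: -nnz_per_row[r]):
        load, p = heapq.heappop(heap)
        assignment[p].append(row)
        heapq.heappush(heap, (load + nnz_per_row[row], p))
    return assignment


nnz = [5, 1, 4, 2, 3, 3]  # hypothetical nonzero counts per row
assignment = pack_rows(nnz, 2)
loads = [sum(nnz[r] for r in rows) for rows in assignment]
```

Balancing nonzeros rather than row counts matters because the work per processor in a sparse product is proportional to the nonzeros it holds.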
“…In the data distribution phase, these local sparse arrays are distributed to processors. In the data compression phase, a local sparse array is compressed by data compression methods in order to obtain better performance for sparse array operations [7,15,16,18,21,23,26,30]. A data distribution scheme with this order is called the Send Followed Compress (SFC) scheme.…”
Section: Introduction (mentioning)
confidence: 99%