2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
DOI: 10.1109/ccgrid.2010.81
From Sparse Matrix to Optimal GPU CUDA Sparse Matrix Vector Product Implementation

Abstract: The CUDA model for GPUs presents the programmer with a plethora of different programming options. These include different memory types, different memory access methods, and different data types. Identifying which options to use and when is a non-trivial exercise. This paper explores the effect of these different options on the performance of a routine that evaluates sparse matrix vector products. A process for analysing performance and selecting the subset of implementations that perform best is proposed. The…
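To make concrete what such a routine computes, the following is a minimal sketch of a scalar CSR (compressed sparse row) SpMV kernel in CUDA, with one thread assigned per matrix row. The names and the thread-per-row mapping are illustrative assumptions only, not the paper's actual implementation, which compares many variants of this basic idea.

```cuda
// Minimal sketch of y = A*x for a matrix A stored in CSR format.
// One thread computes one row of the result. Illustrative only: the paper
// benchmarks many variants of this kernel, differing in memory placement,
// access pattern, and data types.
__global__ void spmv_csr_scalar(int num_rows,
                                const int   *row_ptr,  // row offsets, length num_rows + 1
                                const int   *col_idx,  // column index of each non-zero
                                const float *vals,     // value of each non-zero
                                const float *x,        // dense input vector
                                float       *y)        // dense output vector
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < num_rows) {
        float dot = 0.0f;
        for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
            dot += vals[j] * x[col_idx[j]];
        y[row] = dot;
    }
}
```

Each design choice above, such as where vals, col_idx and x reside, whether accesses coalesce, and whether single or double precision is used, is one of the "options" the abstract refers to.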

Cited by 10 publications (5 citation statements) | References 9 publications

Citation statements (ordered by relevance):

“…Our work produces results comparable to other modeling efforts of the SpMV on GPUs [6], [8], [9]. Our proposed model offers an alternative method in the prediction of SpMV execution time by using the number of memory accesses.…”
Section: Results (supporting, confidence: 73%)

“…El Zein and Rendell [8] attempted to identify the best CUDA implementation using the CSR format with an experimental approach. Their technique creates multiple kernel implementations, all using different combinations of memory locations for storing the matrix.…”
Section: Literature Review (mentioning, confidence: 99%)
A memory-placement sketch illustrating this idea appears after the excerpts below.

“…Many libraries are available, such as [21, 29–34]. Some libraries are so specialized that they handle only a single aspect of linear algebra, as seen in [35–41]. Each of these libraries offers different implementations.…”
Section: Inner Core (mentioning, confidence: 99%)

“…Zein and Rendell [26] has explored the effect of these different options on the performance of a routine that evaluated sparse matrix vector products. They have proposed a process for analysing performance and selecting the subset of implementations that perform best.…”
Section: Many-core Graphics Processing Unit (mentioning, confidence: 99%)
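
As a rough illustration of the kind of variation the literature-review excerpt describes (multiple kernels that differ only in where operands are stored), the sketch below reads the input vector x through the texture cache instead of directly from global memory. This is not the paper's code: it uses the texture-object API, which postdates the 2010 paper (the original would have relied on texture references), and the kernel and helper names are assumptions for illustration.

```cuda
#include <cuda_runtime.h>

// Variant of the scalar CSR SpMV kernel in which the input vector x is
// fetched through the texture cache rather than read from global memory.
__global__ void spmv_csr_scalar_tex(int num_rows,
                                    const int   *row_ptr,
                                    const int   *col_idx,
                                    const float *vals,
                                    cudaTextureObject_t x_tex,  // x bound to a 1D texture
                                    float       *y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < num_rows) {
        float dot = 0.0f;
        for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
            dot += vals[j] * tex1Dfetch<float>(x_tex, col_idx[j]);
        y[row] = dot;
    }
}

// Host-side helper: wrap a device array of n floats in a 1D texture object
// so the kernel above can fetch x through the texture path.
cudaTextureObject_t make_x_texture(float *d_x, int n)
{
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = d_x;
    res.res.linear.desc = cudaCreateChannelDesc<float>();
    res.res.linear.sizeInBytes = (size_t)n * sizeof(float);

    cudaTextureDesc td = {};
    td.readMode = cudaReadModeElementType;

    cudaTextureObject_t x_tex = 0;
    cudaCreateTextureObject(&x_tex, &res, &td, nullptr);
    return x_tex;
}
```

Timing near-identical kernel variants like these against one another is, in spirit, the experimental selection process the excerpt attributes to El Zein and Rendell.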