2020
DOI: 10.1007/s11227-020-03186-1
Performance modeling of the sparse matrix–vector product via convolutional neural networks

Abstract: Modeling the execution time of the Sparse Matrix-Vector multiplication (SpMV) on a current CPU architecture is especially complex due to i) irregular memory accesses; ii) indirect memory referencing; and iii) low arithmetic intensity. While analytical models may yield accurate estimates for the total number of cache hits/misses, they often fail to accurately predict the total execution time. In this paper, we depart from the analytic approach and instead leverage Convolutional Neural Networks (CNNs) in order to…

Cited by 9 publications (10 citation statements) · References 22 publications
“…Similarly, 80% of the training dataset was utilized only for training, and the remaining 20% was used for validation, to guide the training process. In Barreda et al (2020c) we showed that the trained models for the time metric using this dataset provide an appropriate generalization power. Using too much training data could lead to a model that overfits the problem.…”
Section: Obtaining the Dataset
confidence: 86%
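The 80/20 training/validation split described in the statement above can be sketched as follows. This is an illustrative reconstruction, not the paper's code; the function name `split_dataset` and the fixed seed are assumptions for the example.

```python
import random

def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle and split samples into training and validation subsets.

    Illustrative sketch of an 80/20 split; the original work's exact
    shuffling and partitioning procedure is not specified here.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

samples = list(range(100))
train, val = split_dataset(samples)
print(len(train), len(val))  # 80 20
```

Holding out the 20% validation subset during training is what allows it to guide the process (e.g. early stopping) while still signaling overfitting.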
“…The strategy of partitioning the vpos0 array into blocks forces us to implement a blockwise version of the classic CSR-based SpMV Algorithm 1 to generate the training dataset for the CNNs, in which each block of vpos0 is labeled with its corresponding ratios of execution time and energy consumption (total, package, and DRAM) per nonzero element. In a previous work (Barreda et al, 2020c), we analyzed the impact of the block size on the time predictions and concluded that the proposed network architectures deliver accurate results for small block sizes. The reason is that, in general, small blocks reflect a small set of sparsity patterns which, in turn, can be better captured by the CNN filters.…”
Section: Methods
confidence: 96%
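For reference, the classic CSR-based SpMV kernel that the blockwise variant above builds on can be sketched like this. The array names `row_ptr`, `col_idx`, and `values` are the conventional CSR fields, used here as assumptions; the cited work's own notation (e.g. `vpos0`) and its blockwise labeling are not reproduced.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A in CSR format.

    row_ptr[i]..row_ptr[i+1] delimit the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    """
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        # Accumulate the dot product of row i with x; the indirect
        # access x[col_idx[k]] is the source of the irregular memory
        # behavior the abstract mentions.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# 2x2 example: A = [[1, 2], [0, 3]], x = [1, 1]  ->  y = [3, 3]
y = spmv_csr([0, 2, 3], [0, 1, 1], [1.0, 2.0, 3.0], [1.0, 1.0])
```

The blockwise version described in the statement would run this kernel per block of rows and record per-block time/energy ratios as CNN training labels.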
“…Other interesting efforts related to this topic are the article by Williams et al (2007), where the authors study several optimization techniques for the SpMV kernel over several hardware platforms; the proposal by Erguiz et al (2017), with advances in the automatic selection of different sparse triangular linear solvers on GPU; and the work by Barreda et al (2020), which offers a performance model of the SpMV kernel via convolutional neural networks with ARM as the target hardware platform.…”
Section: Automatic Tuning and Performance Models for the SpMV in GPUs
confidence: 99%