2015 IEEE International Parallel and Distributed Processing Symposium
DOI: 10.1109/ipdps.2015.98

Optimizing Sparse Matrix Operations on GPUs Using Merge Path

Cited by 28 publications
(12 citation statements)
References 13 publications
“…However, it can be related indirectly with the consumption of auxiliary memory or the length of rows and the data locality. Aspect ratio (m/n): Short and wide matrices can present a strong variation in row lengths which can cause load imbalance issues among threads or warps. Works such as Bell and Garland (2008) and Dalton et al (2015) have observed that the performance of certain kernels is strongly sensitive to this factor. Number of nonzeros (nnz): The number of nonzeros is a fair estimation of the amount of work to be performed, since at least a multiplication and an addition are necessary for each nonzero. Maximum number of nonzeros per row (nnz_max): The presence of rows much longer than the average can be an important performance problem for kernels that organize the workload row-wise. Minimum number of nonzeros per row (nnz_min): This feature does not seem determinant, but we include it for completeness. Average number of nonzeros per row (nnz_avg): In kernels that organize the workload row-wise, this feature estimates the average workload per computational unit. Standard deviation of nonzeros per row (nnz_std): Together with nnz_max and nnz_avg, this feature aims to estimate the load imbalance between computational units during the execution of the...…”
Section: Automatic Methods Selection (mentioning)
confidence: 99%
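
The row-length features listed in the statement above can be illustrated with a minimal sketch. It assumes the matrix is stored in CSR form and that only its row-pointer array is needed; the function name `row_features`, the dictionary keys, and the use of the population standard deviation are illustrative assumptions, not details taken from the citing paper.

```python
from statistics import mean, pstdev


def row_features(indptr, n_cols):
    """Row-length features of an m x n CSR matrix (hypothetical helper).

    indptr is the CSR row-pointer array of length m + 1, so row i holds
    indptr[i + 1] - indptr[i] nonzeros.
    """
    m = len(indptr) - 1
    row_lengths = [indptr[i + 1] - indptr[i] for i in range(m)]
    return {
        "aspect_ratio": m / n_cols,          # m / n
        "nnz": indptr[-1] - indptr[0],       # total nonzeros
        "nnz_max": max(row_lengths),         # longest row
        "nnz_min": min(row_lengths),         # shortest row
        "nnz_avg": mean(row_lengths),        # average work per row
        # Assumption: population standard deviation; the source does not
        # say whether the sample or population form is used.
        "nnz_std": pstdev(row_lengths),
    }


# A 4 x 6 matrix whose rows hold 1, 5, 0 and 2 nonzeros.
features = row_features([0, 1, 6, 6, 8], n_cols=6)
```

In this toy case the second row holds more than twice the average number of nonzeros, which is the kind of gap between nnz_max and nnz_avg that the statement associates with load imbalance in row-wise kernels.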
“…Aspect ratio (m/n): Short and wide matrices can present a strong variation in row lengths which can cause load imbalance issues among threads or warps. Works such as Bell and Garland (2008) and Dalton et al (2015) have observed that the performance of certain kernels is strongly sensitive to this factor.…”
Section: Automatic Methods Selection (mentioning)
confidence: 99%
“…The optimization of the SpMV kernel for manycore GPUs remains a topic of major interest [5,9,12]. Many of the most recent algorithm developments increase the efficiency by using prefix-sum computations [13] and intra-warp communication [10] on modern manycore hardware.…”
Section: Review Of Sparse Matrix Formats (mentioning)
confidence: 99%