Sparse computations constitute one of the most important areas of numerical algebra and scientific computing. Because of indirect addressing, sparse codes exhibit irregular memory reference patterns. While many studies apply high-level optimizations to sparse computations, few deal with software prefetch. There are two reasons for this: irregular memory accesses are incompatible with regular prefetch, and the hardware prefetch units included in modern processors, such as those of the Intel Core micro-architecture, are both highly efficient and complex. In this paper, we show the efficiency and the limitations of hardware prefetch units, and we propose a technique that uses software prefetch instructions in combination with hardware support to better manage the cache and improve overall code performance. To achieve this goal, the cache behavior of sparse matrix-vector multiplication (SpMV) is analyzed, focusing on the code structure and the ordering of the data. The main cache parameters are identified and their impact on cache performance is evaluated. These parameters are included in a matrix analyzer that determines in advance the efficiency of software prefetch. Furthermore, software prefetch efficiency is analyzed on a large set of sparse matrices. Experimental results show that the matrix analyzer predicts this efficiency accurately and that execution time improves by up to 40%.
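As an illustration of the kind of kernel discussed here, the following is a minimal sketch of SpMV with a software prefetch hint. It is not the technique proposed in the paper. It assumes the matrix is stored in CSR format, a GCC- or Clang-compatible compiler providing `__builtin_prefetch`, and a hypothetical prefetch distance `PREFETCH_DIST` that would have to be tuned for a given machine and matrix. The arrays `val` and `col_idx` are read sequentially, which hardware prefetchers typically handle well, whereas `x[col_idx[k]]` is the indirect access that the sketch prefetches explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance, counted in nonzeros ahead of the
 * current one. The value 16 is an assumption, not a recommendation. */
#define PREFETCH_DIST 16

/* Computes y = A * x for a matrix A stored in CSR format.
 * row_ptr has n_rows + 1 entries; col_idx and val have row_ptr[n_rows]
 * entries each. */
void spmv_csr_prefetch(size_t n_rows,
                       const size_t *row_ptr,
                       const size_t *col_idx,
                       const double *val,
                       const double *x,
                       double *y)
{
    assert(row_ptr != NULL && y != NULL);
    const size_t nnz = row_ptr[n_rows];

    for (size_t i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            /* Hint the irregular access to x a few nonzeros ahead.
             * The bound check keeps the col_idx read in range. */
            if (k + PREFETCH_DIST < nnz) {
                __builtin_prefetch(&x[col_idx[k + PREFETCH_DIST]], 0, 1);
            }
            sum += val[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

The prefetch is only a hint and does not change the result, so the routine can be compared directly against a version without it to measure whether the hint helps for a particular matrix.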