Samir Ammenouche scite author profile

Sparse computations are one of the most important areas of numerical algebra and scientific computing. Because of indirect addressing, sparse codes exhibit irregular memory reference patterns. While many studies apply high-level optimizations to sparse computations, few deal with software prefetch. There are two reasons for this: irregular memory accesses are incompatible with regular prefetch, and the hardware prefetch units in modern processors, such as those of the Intel Core micro-architecture, are both highly efficient and complex. In this paper, we show the efficiency and the limitations of hardware prefetch units, and we propose a technique that uses software prefetch instructions in combination with hardware support to better manage the cache and improve overall code performance. To this end, the cache behavior of sparse matrix-vector multiplication (SpMV) is analyzed, focusing on the code structure and the order of the data. The main cache parameters are identified and their impact on cache performance is evaluated. These parameters are included in a matrix analyzer that determines in advance the efficiency of software prefetch. Furthermore, software prefetch efficiency is analyzed on a large set of sparse matrices. Experimental results show that the matrix analyzer predicts accurately and that execution time improves by up to 40%.
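As a rough illustration of the kind of code the abstract discusses, here is a minimal sketch of SpMV with a software prefetch on the indirectly addressed vector. It is not the paper's technique: the CSR layout, the function name `spmv_csr_prefetch`, and the constant `PREFETCH_DIST` are assumptions made for this example, and `__builtin_prefetch` is a GCC/Clang builtin rather than anything named in the abstract. The sequential streams (`val`, `col_idx`) are left to the hardware prefetcher, while the irregular accesses to `x` get an explicit hint.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, counted in nonzeros. A real value
 * would have to be tuned per machine and per matrix. */
#define PREFETCH_DIST 16

/* Computes y = A * x for a matrix A stored in CSR form (assumed layout):
 *   row_ptr[i] .. row_ptr[i+1]-1 index the nonzeros of row i,
 *   col_idx[k] is the column of nonzero k, val[k] its value. */
void spmv_csr_prefetch(size_t n_rows,
                       const size_t *row_ptr,
                       const size_t *col_idx,
                       const double *val,
                       const double *x,
                       double *y)
{
    size_t nnz = row_ptr[n_rows];

    for (size_t i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            /* x[col_idx[k]] is the irregular access. Hint the element
             * needed PREFETCH_DIST nonzeros ahead (read access, low
             * temporal locality), staying inside the col_idx array. */
            if (k + PREFETCH_DIST < nnz) {
                __builtin_prefetch(&x[col_idx[k + PREFETCH_DIST]], 0, 1);
            }
            sum += val[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

The prefetch is only a hint and does not change the result, so the function can be checked against a plain SpMV loop. Whether it helps at all depends on the matrix structure and the cache, which is the question the abstract's matrix analyzer is said to address.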



Samir Ammenouche

On Instruction-Level Method for Reducing Cache Penalties in Embedded VLIW Processors

Software prefetch on core micro-architecture applied to irregular codes
