GHOST: Building Blocks for High Performance Sparse Linear Algebra on Heterogeneous Systems

2016
DOI: 10.1007/s10766-016-0464-z

Abstract: While many of the architectural details of future exascale-class high performance computer systems are still a matter of intense research, there appears to be a general consensus that they will be strongly heterogeneous, featuring "standard" as well as "accelerated" resources. Today, such resources are available as multicore processors, graphics processing units (GPUs), and other accelerators such as the Intel Xeon Phi. Any software infrastructure that claims usefulness for such environments must be able to me…

Help me understand this report
View preprint versions

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
48
0

Year Published

2016
2016
2022
2022

Publication Types

Select...
6
2

Relationship

6
2

Authors

Journals

citations
Cited by 40 publications
(48 citation statements)
references
References 45 publications
0
48
0
Order By: Relevance
“…6: the application of the polynomial filter. A key feature of our implementation is the use of sparse matrix multiple-vector multiplication (spMMVM) as provided by the GHOST library [29], where the sparse matrix is applied simultaneously to several vectors. As we demonstrated previously for the KPM [6] and a block Jacobi-Davidson algorithm [39], the reduction of memory traffic in spMMVM can lead to significant performance gains over multiple independent spMVMs, where the matrix has to be reloaded from memory repeatedly.…”
Section: Parallel Implementation and Performance Engineering (mentioning)
confidence: 99%
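
The traffic saving described in this excerpt is easy to see in code. Below is a minimal spMMVM sketch in plain C; it is not GHOST's interface (GHOST's native sparse format is SELL-C-σ; plain CRS is used here for brevity), and the names crs_matrix and spmmvm are illustrative. With the block of vectors stored row-major (interleaved), every matrix entry is loaded from memory once and reused for all nb right-hand sides, whereas nb independent spMVMs would stream the whole matrix nb times.

```c
/* Minimal spMMVM sketch (illustrative, not GHOST's API): a CRS matrix
 * applied to a block of nb vectors at once.  X and Y are stored row-major
 * as nrows x nb blocks, so each matrix entry val[j] is read once and
 * reused for all nb right-hand sides. */
#include <stddef.h>

typedef struct {
    int nrows;
    const int *rowptr;    /* size nrows + 1      */
    const int *col;       /* size rowptr[nrows]  */
    const double *val;    /* size rowptr[nrows]  */
} crs_matrix;

/* Y = A * X for a block of nb vectors. */
void spmmvm(const crs_matrix *A, const double *X, double *Y, int nb)
{
    for (int i = 0; i < A->nrows; ++i) {
        for (int b = 0; b < nb; ++b)
            Y[(size_t)i * nb + b] = 0.0;
        for (int j = A->rowptr[i]; j < A->rowptr[i + 1]; ++j) {
            const double a = A->val[j];   /* matrix loaded once ...  */
            const int    c = A->col[j];
            for (int b = 0; b < nb; ++b)  /* ... used for nb vectors */
                Y[(size_t)i * nb + b] += a * X[(size_t)c * nb + b];
        }
    }
}
```

Calling spmmvm once with nb = 4 streams the matrix through memory once; four independent spMVM calls would stream it four times, which is exactly the repeated reload the excerpt refers to.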
“…In Sec. 4 we describe the main performance engineering steps for our GHOST [29]-based ChebFD implementation, which we use for the large-scale application studies in Sec. 5.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, block Krylov methods have been receiving increasing attention in the HPC field [4,1,27,36,30]. They appear to be well suited to modern computer architectures with a high level of parallelism, because they make it possible to reduce the number of global synchronizations while also featuring a higher arithmetic intensity, at the cost of some extra computation.…”
Section: Block Krylov Methods (mentioning)
confidence: 99%
“…Given n, t such that t ≪ n, we denote by V, W tall-and-skinny matrices of size n × t whose rows are distributed among the processors, and by α a matrix of size t × t replicated on the P processors. Following [30], it is possible to decompose the iterations of ECG (and more generally block CG) into the following kernels:…”
Section: Cost Analysis of ECG (mentioning)
confidence: 99%
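
The kernel decomposition alluded to in this excerpt can be sketched with two representative kernels in plain C with MPI; the names and layout are assumptions, not the authors' code from [30]. Forming the replicated Gram product α = VᵀW takes exactly one MPI_Allreduce of a t × t block, a single global synchronization that covers all t columns at once (the synchronization saving mentioned in the previous excerpt), while the block update W ← W − Vα is purely local.

```c
/* Sketch of two block-CG kernels (illustrative names, plain C + MPI).
 * V and W hold the nloc x t local row blocks (row-major) of n x t
 * tall-and-skinny matrices; alpha is a replicated t x t matrix. */
#include <mpi.h>
#include <stdlib.h>

/* alpha = V^T * W: local Gram product, then ONE global reduction. */
void gram(const double *V, const double *W, int nloc, int t, double *alpha)
{
    double *loc = calloc((size_t)t * t, sizeof *loc);
    for (int i = 0; i < nloc; ++i)
        for (int p = 0; p < t; ++p)
            for (int q = 0; q < t; ++q)
                loc[p * t + q] += V[(size_t)i * t + p] * W[(size_t)i * t + q];
    MPI_Allreduce(loc, alpha, t * t, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    free(loc);
}

/* W = W - V * alpha: no communication at all. */
void block_update(double *W, const double *V, const double *alpha,
                  int nloc, int t)
{
    for (int i = 0; i < nloc; ++i)
        for (int q = 0; q < t; ++q) {
            double s = 0.0;
            for (int p = 0; p < t; ++p)
                s += V[(size_t)i * t + p] * alpha[p * t + q];
            W[(size_t)i * t + q] -= s;
        }
}
```

For t = 1 the Gram kernel degenerates to a single dot product with its own reduction; blocking t vectors amortizes the latency of one global reduction over all t columns, at the price of the extra local flops the excerpt mentions.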
“…In addition to the standard types, the CRAFT library was extended with support for GHOST sparse matrix data types [18], PHIST sparse matrix data types [21], and Intel MKL complex data types. These extensions are part of the downloadable code [12].…”
Section: Additional CR Extensions (mentioning)
confidence: 99%
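
As a rough illustration of what such type support involves (a hypothetical sketch, not CRAFT's actual API), a checkpoint/restart layer can treat application objects as opaque and delegate serialization to per-type callbacks; support for a new sparse matrix type then amounts to supplying a write/read pair for it.

```c
/* Hypothetical sketch -- NOT CRAFT's real interface.  A checkpoint layer
 * stores opaque objects via user-supplied callbacks; supporting a new
 * data type means providing a write/read pair for it. */
#include <stdio.h>

typedef int (*cp_write_fn)(const void *obj, FILE *f);
typedef int (*cp_read_fn)(void *obj, FILE *f);

/* Toy fixed-size stand-in for a sparse matrix type. */
typedef struct {
    int nrows, nnz;
    int rowptr[4], col[3];
    double val[3];
} toy_crs;

static int toy_crs_write(const void *obj, FILE *f)
{
    return fwrite(obj, sizeof(toy_crs), 1, f) == 1 ? 0 : -1;
}

static int toy_crs_read(void *obj, FILE *f)
{
    return fread(obj, sizeof(toy_crs), 1, f) == 1 ? 0 : -1;
}

/* The checkpoint layer sees only an opaque pointer plus a callback. */
static int cp_save(const void *obj, cp_write_fn w, const char *path)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    int rc = w(obj, f);
    fclose(f);
    return rc;
}

static int cp_restore(void *obj, cp_read_fn r, const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    int rc = r(obj, f);
    fclose(f);
    return rc;
}

int main(void)
{
    toy_crs A = { 3, 3, {0, 1, 2, 3}, {0, 1, 2}, {1.0, 2.0, 3.0} }, B;
    if (cp_save(&A, toy_crs_write, "matrix.cp")) return 1;
    return cp_restore(&B, toy_crs_read, "matrix.cp") ? 1 : 0;
}
```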