2021
DOI: 10.48550/arxiv.2103.11991
Preprint

Kokkos Kernels: Performance Portable Sparse/Dense Linear Algebra and Graph Kernels

Abstract: As hardware architectures evolve in the push towards exascale, developing Computational Science and Engineering (CSE) applications depends on performance portable approaches for sustainable software development. This paper describes one aspect of performance portability with respect to developing a portable library of kernels that serve the needs of several CSE applications and software frameworks. We describe Kokkos Kernels, a library of kernels for sparse linear algebra, dense linear algebra and graph k…

citations
Cited by 3 publications
(4 citation statements)
references
References 22 publications
“…One can solve these dense linear systems effectively using modern manycore CPUs and GPUs [5], in fact this area of research has been the focus of the community for past several years. Libraries such as Kokkos Kernels [6], MAGMA [7], cuSOLVER provide implementations of such solvers. However, several recent formulations result in these small systems themselves being sparse.…”
Section: Introduction
mentioning
confidence: 99%
“…• A performance portable implementation of these solvers using the Kokkos library made available publicly in the Kokkos Kernels library [6].…”
Section: Introduction
mentioning
confidence: 99%
“…Based on [26] that also uses the DAG layer partitioning, but instead of using global barriers, specialized point-to-point sparsified barriers are used for only the threads involved in dependencies, avoiding unnecessary stalls of the rest of the threads. Performance of the sparse kernels from the portable Kokkos library V3.4.1 [27] is benchmarked.…”
mentioning
confidence: 99%
“…In the wake of the rising importance of graph-based computations, the hardware landscape within the compute industry began to undergo key shifts. Traditional Central Processing Units (CPUs), initially designed for sequential tasks, started incorporating SIMD-based graph extensions to enhance parallel processing capabilities [215]. Graphics Processing Units (GPUs), with their inherent parallelism, were enhanced with kernel support tailored specifically for graph algorithms [148,174]. Beyond these general-purpose processors, the industry also witnessed the advent of domain-specific accelerators [86,115,153,202], specifically crafted to speedup graph computations, addressing the unique challenges and demands that graph algorithms present.…”
Section: Background and Motivation
mentioning
confidence: 99%