2018
DOI: 10.1002/cpe.4470
Accelerating explicit ODE methods on GPUs by kernel fusion

Abstract: Graphics processing units (GPUs) have a promising architecture for implementing highly parallel solution methods for systems of ordinary differential equations (ODEs). However, their high performance comes at the price of caveats such as small caches or wide SIMD. For ODE methods, optimizing the memory access pattern is often crucial. In this article, instead of considering only one specific method, we generalize the description of explicit ODE methods by using data flow graphs consisting of basic operations t…

Cited by 8 publications (6 citation statements)
References 32 publications
“…The application of kernel fusion to ODE methods on GPUs for general ODE systems was also considered 2 . For those systems it is only allowed to fuse RHS → LC, RHS → RED, LC → LC and LC → RED dependencies, while a global barrier is required for each LC → RHS dependency.…”
Section: Related Work
confidence: 99%
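As a hedged illustration of the dependency taxonomy quoted above (RHS = right-hand-side evaluation, LC = linear combination, RED = reduction), the following NumPy sketch shows why an RHS → LC dependency is fusible while LC → RHS forces a global barrier. The example ODE, function names, and step code are hypothetical, not taken from the cited framework.

```python
import numpy as np

# Hypothetical example system y' = -y; rhs_i evaluates one component of
# the right-hand side. In general rhs_i may read ALL of y, which is why
# a step's result must be fully written (global barrier on a GPU) before
# the next RHS evaluation can start.
def rhs_i(t, y, i):
    return -y[i]

def euler_step_unfused(t, y, h):
    # Two "kernels": RHS materializes F in memory, LC reads it back.
    F = np.array([rhs_i(t, y, i) for i in range(y.size)])  # RHS kernel
    return y + h * F                                       # LC kernel

def euler_step_fused(t, y, h):
    # RHS -> LC fused: each component's f_i is consumed immediately,
    # so the intermediate array F never round-trips through memory.
    out = np.empty_like(y)
    for i in range(y.size):
        f_i = rhs_i(t, y, i)      # RHS for component i
        out[i] = y[i] + h * f_i   # LC uses f_i right away
    return out
    # LC -> RHS cannot be fused the same way: the NEXT step's rhs_i may
    # read any component of `out`, so all of `out` must exist first.
```

Both variants compute the same Euler step; the fused form merely removes the intermediate array, mirroring how fusing RHS → LC removes a kernel boundary and its off-chip traffic.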
“…We have added the ability to generate multi‐workgroup tilings for explicit one‐step methods along a user defined dependency chain to our automatic prototype framework 2,3 . This framework allows a user to solve an arbitrary IVP by an arbitrary explicit ODE method of several supported classes (RK methods, PIRK methods, peer methods, Adams–Bashforth methods).…”
Section: Experimental Evaluation
confidence: 99%
“…Another known approach is kernel fusion, a technique to fuse multiple memory-intensive ops with data dependencies into a single kernel to reduce off-chip memory accesses. Prior works have explored this idea extensively in database [32], image processing [5,16,23], HPC applications [18,30], and AI workloads [7,19]. However, there are two notable limitations when targeting memory-intensive DL models.…”
Section: Introduction
confidence: 99%
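A minimal sketch of the general technique this statement describes (fusing dependent memory-intensive elementwise operations into one pass to reduce off-chip traffic); the function names are illustrative and assume nothing about the cited systems:

```python
import numpy as np

def axpb_unfused(x, a, b):
    # Two passes over memory: `tmp` is written out and read back,
    # doubling the off-chip traffic for this dependency chain.
    tmp = a * x       # op 1
    return tmp + b    # op 2 (depends on op 1)

def axpb_fused(x, a, b):
    # One pass: each element is loaded once and both ops are applied
    # before the result is stored; the intermediate value stays in
    # registers instead of a temporary array.
    out = np.empty_like(x)
    for i in range(x.size):
        out[i] = a * x[i] + b
    return out
```

On a GPU the same transformation merges two kernel launches into one, which is the source of the bandwidth savings the citing papers target.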