2019
DOI: 10.1177/1094342019842645

Studies on the energy and deep memory behaviour of a cache-oblivious, task-based hyperbolic PDE solver

Abstract: We study the performance behaviour of a seismic simulation using the ExaHyPE engine with a specific focus on memory characteristics and energy needs. ExaHyPE combines dynamically adaptive mesh refinement (AMR) with ADER-DG. It is parallelized using tasks, and it is cache efficient. AMR plus ADER-DG yields a task graph which is highly dynamic in nature and comprises both arithmetically expensive tasks and tasks which challenge the memory's latency. The expensive tasks and thus the whole code benefit from AVX ve…

Cited by 16 publications (23 citation statements)
References 11 publications
“…In general good scalability can be observed on up to 14 cores at higher orders. All ExaHyPE codes employ a hybrid parallelisation strategy with at least two MPI ranks per node [55]. Results for this hybrid parallelisation strategy are provided in Section 6.6.…”
Section: Euler Equations
mentioning
confidence: 99%
“…If we study a linear variant of (1), we integrate the cell with the Cauchy-Kowalevski procedure [14]. Here, the STP is significantly cheaper, though it still yields localized data access [8]. The time integration following the STP allows us to reuse the outcome data structure for all intermediate-in-time results.…”
Section: C77
mentioning
confidence: 99%
“…Its computations per mesh cell are arithmetically intense, which is a property they share with many higher-order methods [25]. At the same time, DG's data access pattern however is very localized [11] (this helps to reduce the memory access stress [8,17,20,24,27]), and its exchange between cells along their connecting faces is conceptually simple. A combination of these two properties, high intensity to exploit vector units and dynamic adaptive mesh refinement (AMR) to invest where it pays off most, is a fit to predictions of what exascale software will have to look like [10].…”
mentioning
confidence: 99%
“…This allows us to single out a failing rank. The ∆t_HB ensures that the system is not flooded with heartbeat messages and is not overly sensitive to small performance fluctuations [6].…”
Section: Implementation Decisions
mentioning
confidence: 99%