Route to exascale: Novel mathematical methods, scalable algorithms and Computational Science skills

Alexandrov, Vassil

doi:10.1016/j.jocs.2016.04.014

Cited by 7 publications

(4 citation statements)

References 9 publications

“…Ideally, a heterogeneous application will minimize communication between the GPU and CPU, which effectively minimizes latency costs. Minimizing latency in high-performance computing is one of the barriers to exascale computing that requires the implementation of novel techniques to improve [5].…”

Section: Introduction (mentioning)

confidence: 99%

Applying the Swept Rule for Solving Two-Dimensional Partial Differential Equations on Heterogeneous Architectures

Walker

Niemeyer

2021

MCA

The partial differential equations describing compressible fluid flows can be notoriously difficult to resolve on a pragmatic scale and often require the use of high-performance computing systems and/or accelerators. However, these systems face scaling issues such as latency, the fixed cost of communicating information between devices in the system. The swept rule is a technique designed to minimize these costs by obtaining a solution to unsteady equations at as many possible spatial locations and times prior to communicating. In this study, we implemented and tested the swept rule for solving two-dimensional problems on heterogeneous computing systems across two distinct systems and three key parameters: problem size, GPU block size, and work distribution. Our solver showed a speedup range of 0.22–2.69 for the heat diffusion equation and 0.52–1.46 for the compressible Euler equations. We can conclude from this study that the swept rule offers both potential for speedups and slowdowns and that care should be taken when designing such a solver to maximize benefits. These results can help make decisions to maximize these benefits and inform designs.

“…Therefore, we must anticipate that multiscale simulations will become an increasingly important form of scientific application on high end computing resources, necessitating the development of sustainable and reusable solutions for such emerging applications, that is, generic algorithms for multiscale computing. As we move into the exascale performance era we need to drastically change the way we use HPC for simulation based sciences [25].…”

Section: Introduction (mentioning)

confidence: 99%

Multiscale computing in the exascale era

Alowayyed

Groen

Coveney

et al. 2017

Journal of Computational Science

We expect that multiscale simulations will be one of the main high performance computing workloads in the exascale era. We propose multiscale computing patterns as a generic vehicle to realise load balanced, fault tolerant and energy aware high performance multiscale computing. Multiscale computing patterns should lead to a separation of concerns, whereby application developers can compose multiscale models and execute multiscale simulations, while pattern software realises optimized, fault tolerant and energy aware multiscale computing. We introduce three multiscale computing patterns, present an example of the extreme scaling pattern, and discuss our vision of how this may shape multiscale computing in the exascale era.

“…In many ways recent improvements in computational capacity have been sustained by the development of accelerators or co-processors, such as general purpose graphics processing units (GPGPUs) or the Intel Xeon Phi manycore processor, that augment the computational capabilities of the CPU. These devices have grown in power and complexity over the last two decades, leading to an increasing reliance on them for enabling efficient floating-point computation on HPC systems [1]. As these systems grow in complexity, computational power, and physical size, latency and bandwidth costs limit the performance of applications that require regular inter-node communication, such as CFD simulations.…”

Section: Introduction (mentioning)

confidence: 99%

“…to hide network and memory latency, have very high computation/communication overlap, have minimal communication, have fewer synchronization points", and "mathematical methods developed and corresponding scientific algorithms need to match these architectures [standard processors and GPGPUs] to extract the most performance. This includes different system-specific levels of parallelism as well as co-scheduling of computation" [1].…”

Section: Introduction (mentioning)

confidence: 99%

Applying the swept rule for solving explicit partial differential equations on heterogeneous computing systems

Magee

Walker

2018

Preprint

Applications that exploit the architectural details of high performance computing (HPC) systems have become increasingly invaluable in academia and industry over the past two decades. The most important hardware development of the last decade in HPC has been the General Purpose Graphics Processing Unit (GPGPU), a class of massively parallel devices that now contributes the majority of computational power in the top 500 supercomputers. As these systems grow, small costs such as latency (the fixed cost of memory accesses) accumulate in a large simulation and become a significant barrier to performance. The swept time-space decomposition rule is a communication-avoiding technique for time-stepping stencil update formulas that attempts to reduce latency costs. This work extends the swept rule by targeting heterogeneous CPU/GPU architectures representative of current and future HPC systems. We compare our approach to a naive decomposition scheme with two test equations using an MPI+CUDA pattern on 40 processes over two nodes containing one GPU. We show that the swept rule produces a 4–18× speedup with the heat equation and a 1.5–3.2× speedup with the Euler equations using the same processors and work distribution. These results show the potential effectiveness of the swept rule for different equations and numerical schemes on massively parallel compute systems that incur substantial latency costs.
