2022
DOI: 10.1029/2021ms002684
Fluid Simulations Accelerated With 16 Bits: Approaching 4x Speedup on A64FX by Squeezing ShallowWaters.jl Into Float16

Abstract: Most Earth‐system simulations run on conventional central processing units in 64‐bit double precision floating‐point numbers Float64, although the need for high‐precision calculations in the presence of large uncertainties has been questioned. Fugaku, currently the world's fastest supercomputer, is based on A64FX microprocessors, which also support the 16‐bit low‐precision format Float16. We investigate the Float16 performance on A64FX with ShallowWaters.jl, the first fluid circulation model that runs entirely…

Cited by 11 publications (6 citation statements)
References 69 publications
“…Speed‐ups, see Figure 14b), for larger grid sizes lie between 3.8 and 4.2 consistently. For more information on half precision performance on the A64FX for physical models, see (Klöwer et al., 2022).…”
Section: Results (mentioning)
confidence: 99%
“…For simpler and smaller codes like SPEEDY, it is possible to identify such precision bottlenecks. In this case, one could rescale the relevant equations to ensure that the float operations can be described by the necessary number format as regards dynamic range and precision (see, e.g., Klöwer et al, 2022). Conversely, for larger, more complex codes, rescaling equations and refactoring code can be prohibitively difficult.…”
Section: Discussion (mentioning)
confidence: 99%
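
The rescaling idea mentioned in the statement above can be illustrated with a minimal sketch. It is not taken from the cited papers: the value `1.0e-7` and the scaling factor `2**20` are hypothetical, and Float16 rounding is emulated with the half-precision format code `"e"` of Python's `struct` module. The sketch only shows that a quantity below Float16's normal range loses most of its precision unless it is first multiplied by a power of two.

```python
import struct


def to_f16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 (Float16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def rel_error(approx: float, exact: float) -> float:
    """Relative error of approx with respect to exact."""
    return abs(approx - exact) / abs(exact)


# Hypothetical small quantity, below Float16's smallest normal number (about 6.1e-5),
# so it can only be stored as a coarse subnormal.
value = 1.0e-7

# Hypothetical scaling factor. A power of two makes the scaling itself exact.
scale = 2.0 ** 20

# Store the quantity directly in Float16.
unscaled = to_f16(value)

# Store the rescaled quantity in Float16, then undo the scaling in Float64
# purely to compare against the exact value.
rescaled = to_f16(value * scale) / scale

err_unscaled = rel_error(unscaled, value)
err_rescaled = rel_error(rescaled, value)

print(f"relative error without rescaling: {err_unscaled:.3e}")
print(f"relative error with rescaling:    {err_rescaled:.3e}")
```

In a real model the scaling would be absorbed into the equations themselves, so that the scaled variable is the one carried through the computation; the final division here exists only for the comparison.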
“…Numerical stability and performance usually dictate this choice, but low precision can add an additional constraint: Using a shorter time step can cause stagnation as tendencies are too small to be added in the time integration. Stagnation from low precision can be overcome with a compensated time integration [61] or with stochastic rounding [32]. Longer orbits with more variables.…”
Section: Orbits In the Lorenz 1996 System (mentioning)
confidence: 99%
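
The stagnation described in the statement above, and one way a compensated integration can avoid it, can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in any of the cited papers: it uses Kahan summation as one form of compensation, a constant hypothetical tendency `dx = 1.0e-4`, and emulates Float16 rounding with the `"e"` format code of Python's `struct` module.

```python
import struct


def to_f16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 (Float16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def integrate_naive(x0: float, dx: float, n_steps: int) -> float:
    """Add a constant increment n_steps times, rounding to Float16 after every add."""
    x = to_f16(x0)
    dx = to_f16(dx)
    for _ in range(n_steps):
        x = to_f16(x + dx)
    return x


def integrate_compensated(x0: float, dx: float, n_steps: int) -> float:
    """Same loop with Kahan compensation; every intermediate is rounded to Float16."""
    x = to_f16(x0)
    dx = to_f16(dx)
    c = 0.0  # running estimate of the rounding error made so far
    for _ in range(n_steps):
        y = to_f16(dx - c)             # increment corrected by the stored error
        t = to_f16(x + y)              # new state, low-order bits of y may be lost
        c = to_f16(to_f16(t - x) - y)  # what was actually added minus what was intended
        x = t
    return x


# Hypothetical setup: state 1.0, increment 1e-4, 1000 steps (exact result is about 1.1).
# The increment is smaller than half the Float16 spacing at 1.0 (about 4.9e-4),
# so the naive loop rounds back to 1.0 on every step.
naive = integrate_naive(1.0, 1.0e-4, 1000)
compensated = integrate_compensated(1.0, 1.0e-4, 1000)

print(f"naive Float16 sum:       {naive}")
print(f"compensated Float16 sum: {compensated}")
```

The compensation term costs one extra Float16 variable per state variable and a few extra additions per step. Stochastic rounding, the other remedy named in the statement, is not shown here.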