2018
DOI: 10.1016/j.jpdc.2017.10.014

Scaling of a Fast Fourier Transform and a pseudo-spectral fluid solver up to 196608 cores

Abstract: In this paper we present scaling results of an FFT library, FFTK, and a pseudospectral code, Tarang, on grid resolutions up to $8192^3$ using 65536 cores of Blue Gene/P and 196608 cores of Cray XC40 supercomputers. We observe that communication dominates computation, more so on the Cray XC40. The computation time scales as $T_\mathrm{comp} \sim p^{-1}$, and the communication time as $T_\mathrm{comm} \sim n^{-\gamma_2}$ with $\gamma_2$ ranging from 0.7 to 0.9 for Blue Gene/P, and from 0.43 to 0.73 for Cray XC40. FFTK, and the fluid and convection solve…

citations
Cited by 94 publications
(73 citation statements)
references
References 35 publications
“…The grid corresponds to a cubical domain of unit dimension. The simulation was performed using a pseudo-spectral code [55,56]. Free-slip and isothermal boundary conditions were employed at the top and bottom plates, and periodic boundary conditions were employed at the side walls.…”
Section: Methods (mentioning)
confidence: 99%
“…We list the values of the energy flux in Table III. We also compute the Fourier transform of our velocity and temperature field data, and compute the spectral energy flux using the following relation [55,56]:…”
Section: B. Probability Distribution Function For Velocity Increments (mentioning)
confidence: 99%
“…where $\mathbf{u}$ is the velocity field, $\boldsymbol{\Omega} = \Omega\hat{z}$ is the angular velocity of the rotating frame, $p$ is the pressure field which includes contributions from centrifugal acceleration, $\nu$ is the kinematic viscosity, $-2\boldsymbol{\Omega} \times \mathbf{u}$ is the Coriolis acceleration, and $\mathbf{f}$ is the force field. We have simulated these equations in a cube of size $(2\pi)^3$ with periodic boundary condition on all the sides using pseudo-spectral code, Tarang [60,61]. We have used fourth-order Runge-Kutta method for time stepping, and Courant-Friedrichs-Lewy (CFL) condition to optimize the time stepping ($\Delta t$) and 2/3 rule for dealiasing.…”
Section: The Model System (mentioning)
confidence: 99%
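The 2/3 dealiasing rule named in the statement above can be illustrated with a minimal one-dimensional sketch. This is not code from Tarang: the function names are hypothetical, and the strict cutoff $|k| < N/3$ is one common convention (some codes use a slightly different cutoff).

```python
def fft_wavenumbers(n):
    """Integer wavenumbers in standard FFT order: 0, 1, ..., n/2-1, -n/2, ..., -1."""
    return [k if k < n // 2 else k - n for k in range(n)]


def two_thirds_mask(n):
    """True for modes kept under the 2/3 rule (assumed cutoff: |k| < n/3)."""
    return [3 * abs(k) < n for k in fft_wavenumbers(n)]


def dealias(coeffs):
    """Zero the Fourier coefficients whose wavenumber lies outside the cutoff."""
    mask = two_thirds_mask(len(coeffs))
    return [c if keep else 0j for c, keep in zip(coeffs, mask)]


if __name__ == "__main__":
    n = 12
    for k, keep in zip(fft_wavenumbers(n), two_thirds_mask(n)):
        print(k, "kept" if keep else "zeroed")
```

With this cutoff, the product of two retained modes can only alias into modes that are themselves zeroed, which is the point of applying the mask before and after evaluating a quadratic nonlinear term.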
“…The domain is decomposed in the vertical direction (a so-called 1D or slab decomposition) in such a way that the vertical planes are evenly distributed to all MPI tasks (the slabs will be further decomposed into smaller domains using OpenMP, as described below). A relatively common alternative to this approach is to use a 2D "pencil" decomposition (Yeung et al., 2005; Chatterjee et al., 2018), whose performance implications were considered in M11. If $P$ is the number of MPI tasks, there are $M = N_z/P$ planes of the global domain assigned as work to each task, and from the figure, it is clear that each task "owns" a slab of size $N_x \times N_y \times M$ points.…”
Section: Problem Description (mentioning)
confidence: 99%
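The slab bookkeeping described in the statement above ($M = N_z/P$ planes per task, each task owning $N_x \times N_y \times M$ points) can be sketched as follows. The function names are hypothetical and no MPI library is used; the sketch assumes $N_z$ is exactly divisible by the number of tasks, as the "evenly distributed" wording suggests.

```python
def slab_bounds(nz, num_tasks, rank):
    """Half-open range [start, stop) of vertical planes owned by one task."""
    if nz % num_tasks != 0:
        raise ValueError("this sketch assumes nz is divisible by num_tasks")
    if not 0 <= rank < num_tasks:
        raise ValueError("rank must lie in [0, num_tasks)")
    m = nz // num_tasks  # M = N_z / P planes per task
    return rank * m, (rank + 1) * m


def slab_shape(nx, ny, nz, num_tasks):
    """Local slab size (N_x, N_y, M) owned by each task."""
    if nz % num_tasks != 0:
        raise ValueError("this sketch assumes nz is divisible by num_tasks")
    return nx, ny, nz // num_tasks


if __name__ == "__main__":
    # Illustrative numbers only: an 8192^3 grid split over 64 tasks.
    print(slab_bounds(8192, 64, 3))
    print(slab_shape(8192, 8192, 8192, 64))
```

A slab decomposition of this kind cannot use more than $N_z$ tasks, which is one reason the 2D pencil alternative mentioned in the statement is used at very large core counts.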