Comments on CFD code performance on scalable architectures

Behr, Marek; Pressel, Daniel; Sturek, Walter B.

doi:10.1016/s0045-7825(00)00201-2

Cited by 16 publications

(5 citation statements)

References 8 publications


“…It is hoped that this report and, in particular, the figures and data tables will enable the reader to better evaluate the merits of these systems in relation to his or her needs. Behr et al (2000). Hisley et al (1998).…”

Section: Discussion (mentioning)

confidence: 99%

A Comparison of the Performance of Two Popular Symmetric Multiprocessors When Used to Run High Performance Computing Applications

Pressel, Schraml, Thompson et al., 2002


“…The parallel implementation is based on the Message-Passing Interface, as well as iterative solution techniques, in particular, GMRES [9]. More details about this implementation, as well as the results of the architecture comparison, have been presented in Reference [10].…”

Section: Numerical Example (mentioning)

confidence: 99%

Stabilized space‐time finite element formulations for free‐surface flows

Behr, 2001, Commun. Numer. Meth. Engng.

Self Cite


Summary: Aspects of a method for 3D finite element computation of unsteady, incompressible free-surface flow are presented. The approach is based on the deformable-spatial-domain/stabilized space-time (DSD/SST) finite element formulation, which automatically takes into account the deformation of the elements in response to the motion of the free surface. The free-surface elevation is governed by a kinematic free-surface condition, which is also solved with a stabilized formulation. A new governing equation and stabilized formulation are derived for cases where the channel walls are not vertical. The parallel implementation, based on the MPI message-passing standard, is fully portable and has been demonstrated to be scalable on a range of architectures. A 3D computation of flow past a spillway of a dam is shown as an example application.


“…Under unfavorable circumstances, it can easily exceed 1-million cycles. To keep the cost of the overhead down to no more than 1% of the total CPU time, the parallel … [Footnote a: For additional details, see Behr et al (2000).]…”

Section: Parallelization Costs (mentioning)

confidence: 99%

The Scalability of Loop-Level Parallelism

Pressel, 2001


This report deals with the four main constraints on the scalability of programs parallelized using loop-level parallelism. They are as follows: (1) The available parallelism in the algorithm. (2) The availability and scalability of appropriate hardware (including the operating system and the compilers). (3) Limitations in the design of the hardware. (4) The cost of getting into and out of a parallel section of code. This, in turn, will lead to two important discussions: (1) the theoretical limitations on the scalability of shared memory codes and (2) the role that the choice of hardware and usage policies play in determining the performance of a shared memory code. These discussions will include examples from the author's own work in porting the implicit computational fluid dynamics code F3D from the Cray C90 to a variety of shared memory platforms.

Acknowledgments: The author thanks Marek Behr, formerly of the U.S. Army High Performance Computing Research Center (AHPCRC), for sharing his results and the many colleagues who worked on these research projects over the years and helped collect this data and prepare this report. The author would also like to thank the employees of Business Plus, especially Claudia Coleman and Maria Brady, who assisted in the preparation and editing of this report. Special thanks to Tom Kendall, Denice Brown, and the entire systems staff at the ARL-MSRC for their support of the various projects for which these runs were originally done.

