Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis 2013
DOI: 10.1145/2503210.2503298

Petascale direct numerical simulation of turbulent channel flow on up to 786K cores

Cited by 81 publications (47 citation statements)
References 16 publications
“…(Lee et al, 2013) in BG/Q. Both codes are tested with grid size 3K×2K×1.5K in physical space using 1K MPI tasks and each task with 64 threads (16 cores with ×4 hyper-threading).…”
Section: A5 Performance Of the Code On Bg/q In Julich Supercomputi
Citation type: mentioning
confidence: 99%
“…Scaling will be efficient as long as the cost for MPI communication is smaller than all the other elementwise operations required to set up the right hand side of the explicit ODE given by Eq. (14). This is not a trivial task, considering that a pseudo-spectral solver demands that every process sends and receives data from all the other processes - at least for the slab decomposition.…”
Section: Parallel Scaling
Citation type: mentioning
confidence: 99%
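
The all-to-all pattern this statement describes can be illustrated with a toy example. The sketch below is not taken from the citing paper: it simulates, in plain Python and without MPI, a transpose of a small square matrix split into row slabs, and counts the off-process messages. The function name `slab_transpose` and the 4×4 test matrix are assumptions made for illustration.

```python
def slab_transpose(slabs):
    """Transpose a square matrix stored as row slabs, one slab per process.

    Returns the transposed slabs and the number of off-process messages.
    """
    nproc = len(slabs)
    width = len(slabs[0])  # rows owned by each (simulated) process

    # blocks[src][dst]: the part of src's rows that falls in dst's columns.
    # In a real code each block with src != dst would be one message.
    blocks = [
        [[row[dst * width:(dst + 1) * width] for row in slabs[src]]
         for dst in range(nproc)]
        for src in range(nproc)
    ]

    out = []
    for dst in range(nproc):
        rows = []
        for j in range(width):
            # Row j of dst's new slab gathers column j of every incoming block.
            rows.append([blocks[src][dst][i][j]
                         for src in range(nproc) for i in range(width)])
        out.append(rows)
    return out, nproc * (nproc - 1)


n, nproc = 4, 2
matrix = [[i * n + j for j in range(n)] for i in range(n)]
width = n // nproc
slabs = [matrix[p * width:(p + 1) * width] for p in range(nproc)]
transposed, messages = slab_transpose(slabs)
```

Every process contributes a block to every other process, so the message count grows as P(P-1) for P processes, which is why communication cost tends to limit scaling for slab decompositions.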
“…However, the Nelder-Mead method of AH is originally designed to work in a multi-dimensional orthotope (hyperrectangle) parameter space. So the AH server can generate an infeasible test configuration. We customized the AH client to report the worst performance value (infinity) immediately back to the AH server when the AH client receives an infeasible configuration.…”
Section: Customization
Citation type: mentioning
confidence: 99%
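
The idea of reporting infinity for an infeasible configuration can be sketched as follows. This is a minimal illustration, not the AH client or its API: the feasibility rule (two factors whose product must equal 16), the cost function `measure`, and the exhaustive search standing in for the tuner are all hypothetical.

```python
import math


def feasible(config):
    # Hypothetical constraint: the two factors must multiply to 16 tasks.
    rows, cols = config
    return rows >= 1 and cols >= 1 and rows * cols == 16


def measure(config):
    # Hypothetical stand-in for a timed run; lower is better.
    rows, cols = config
    return abs(rows - cols) + 1.0


def report(config):
    # Infeasible points get the worst possible value without being run,
    # so a search over the full rectangle never selects them.
    return measure(config) if feasible(config) else math.inf


# The tuner sees a plain rectangle of candidates, feasible or not.
candidates = [(r, c) for r in range(1, 17) for c in range(1, 17)]
best = min(candidates, key=report)
```

Because the penalty is returned immediately, no time is spent running configurations that cannot be valid, and the search space can stay a simple hyperrectangle.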
“…Bell et al [13] and Nishtala et al [8] wrote a UPC-based code that requires hardware support for asynchronous communication. As shown in Section 5, the UPC-based code is not optimized because (1) it misses some computation to overlap with communication, (2) it uses a fixed size of communication messages, and (3) it uses the point-to-point communication rather than the optimized collective operation. Fang et al [22] lack portability as they use a special communication API called QMP and require hardware support for asynchronous communication.…”
Section: Related Work
Citation type: mentioning
confidence: 99%