Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis 2015
DOI: 10.1145/2807591.2807602
Improving concurrency and asynchrony in multithreaded MPI applications using software offloading

Cited by 30 publications (9 citation statements)
References 22 publications

Citation statements (ordered by relevance):
“…An alternative way of improving the overlap is using MPI+OpenACC+OpenMP, in which OpenMP is used to generate multiple threads. These threads can work on different tasks, such as computation and communication, so that the actual degree of overlap can be increased [30-34]. In fact, there is more literature discussing how to improve overlap performance, and almost all of it uses multiple threads.…”
Section: Results (mentioning)
confidence: 99%
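The overlap pattern described in the excerpt above can be illustrated with a minimal MPI+OpenMP sketch (OpenACC omitted for brevity). The buffer names, sizes, and ring-neighbour exchange are illustrative assumptions, not taken from the cited works: one thread drives the halo exchange while the remaining threads update interior points that do not depend on incoming data.

/* Hedged sketch of MPI+OpenMP communication/computation overlap.
 * Names and sizes are illustrative; intended for two or more OpenMP threads. */
#include <mpi.h>
#include <omp.h>
#include <stdlib.h>

#define N 1024

int main(int argc, char **argv)
{
    int provided, rank, size;
    /* Only thread 0 of the team calls MPI, so FUNNELED is sufficient. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *halo_send = calloc(N, sizeof(double));
    double *halo_recv = calloc(N, sizeof(double));
    double *interior  = calloc((size_t)N * N, sizeof(double));
    int left  = (rank - 1 + size) % size;
    int right = (rank + 1) % size;

    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        int nth = omp_get_num_threads();
        if (tid == 0) {
            /* Communication thread: nonblocking halo exchange with neighbours. */
            MPI_Request reqs[2];
            MPI_Irecv(halo_recv, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
            MPI_Isend(halo_send, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);
            MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
        } else {
            /* Worker threads: manually partitioned interior update that does
             * not depend on the incoming halo. */
            long total = (long)N * N;
            long chunk = (total + nth - 2) / (nth - 1);
            long lo = (long)(tid - 1) * chunk;
            long hi = lo + chunk < total ? lo + chunk : total;
            for (long i = lo; i < hi; ++i)
                interior[i] = 0.25 * interior[i];
        }
        #pragma omp barrier
        /* Past the barrier the halo has arrived; boundary points can be updated. */
    }

    free(halo_send); free(halo_recv); free(interior);
    MPI_Finalize();
    return 0;
}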
“…Vaidyanathan et al. [23] contributed an approach for asynchronous progress in the "MPI+X" model by utilizing a dedicated thread together with a lock-free command queue. The "MPI+X" model often utilizes multiple threads over multi- or many-core systems to parallelize computation and employs only a single MPI process per node for internode communication.…”
Section: Communication Asynchronous Progress (mentioning)
confidence: 99%
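The general shape of the technique referenced above, a dedicated communication thread fed through a lock-free command queue, might look as follows. This is a simplified sketch, not the implementation from Vaidyanathan et al. [23]: the queue is a single-producer/single-consumer ring, the cmd_t layout and QCAP capacity are invented for illustration, and real designs add backoff, completion notification, and support for many producers.

/* Hedged sketch: an application thread enqueues send commands into a lock-free
 * SPSC ring; a dedicated thread dequeues them and issues the MPI calls, so the
 * compute threads never enter the MPI library themselves. */
#include <mpi.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QCAP 256

typedef struct { void *buf; int count; int dest; int tag; } cmd_t;

static cmd_t queue[QCAP];
static atomic_size_t q_head, q_tail;          /* consumer / producer cursors */
static atomic_bool shutting_down;

static bool enqueue(cmd_t c)                  /* producer: an application thread */
{
    size_t t = atomic_load_explicit(&q_tail, memory_order_relaxed);
    size_t h = atomic_load_explicit(&q_head, memory_order_acquire);
    if (t - h == QCAP) return false;          /* queue full, caller may retry */
    queue[t % QCAP] = c;
    atomic_store_explicit(&q_tail, t + 1, memory_order_release);
    return true;
}

static void *comm_thread(void *arg)           /* consumer: dedicated MPI thread */
{
    (void)arg;
    for (;;) {
        size_t h = atomic_load_explicit(&q_head, memory_order_relaxed);
        size_t t = atomic_load_explicit(&q_tail, memory_order_acquire);
        if (h == t) {                         /* queue looks empty */
            if (!atomic_load(&shutting_down)) continue;   /* keep polling */
            /* shutdown observed: one more load drains any late command */
            t = atomic_load_explicit(&q_tail, memory_order_acquire);
            if (h == t) break;
        }
        cmd_t c = queue[h % QCAP];
        atomic_store_explicit(&q_head, h + 1, memory_order_release);
        MPI_Send(c.buf, c.count, MPI_BYTE, c.dest, c.tag, MPI_COMM_WORLD);
    }
    return NULL;
}

int main(int argc, char **argv)
{
    int provided, rank, size;
    /* Main and the comm thread never call MPI concurrently in this sketch,
     * so MPI_THREAD_SERIALIZED suffices here. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    pthread_t tid;
    pthread_create(&tid, NULL, comm_thread, NULL);

    static char payload[64] = "hello";
    if (size > 1 && rank == 0) {
        cmd_t c = { payload, sizeof payload, 1, 42 };
        while (!enqueue(c)) { /* spin until there is room */ }
    }
    if (size > 1 && rank == 1) {
        char in[64];
        MPI_Recv(in, sizeof in, MPI_BYTE, 0, 42, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    atomic_store(&shutting_down, true);
    pthread_join(tid, NULL);
    MPI_Finalize();
    return 0;
}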
“…Some MPI implementations also offer specific options for truly asynchronous progress (Intel, 2017; Pritchard et al., 2012b). However, these specific options do not enable performance portability, and such asynchronous progress can also require the maximum thread support level (MPI_THREAD_MULTIPLE (Pritchard et al., 2012a)), which can imply some performance overhead, as shown in Vaidyanathan et al. (2015).…”
Section: Deployment and Comparison On Multiple Nodes (mentioning)
confidence: 99%
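As a small illustration of the thread-level point made above, the sketch below requests MPI_THREAD_MULTIPLE at initialization and checks what the library actually provides. The fallback message is illustrative, and implementation-specific asynchronous-progress switches are deliberately not shown, since they differ between MPI libraries.

/* Hedged sketch: request the maximum thread support level that some
 * asynchronous-progress options demand, and fall back gracefully if the
 * MPI library provides less. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
            printf("MPI_THREAD_MULTIPLE unavailable (got level %d); "
                   "falling back to funneled communication\n", provided);
    }
    /* ... application ... */
    MPI_Finalize();
    return 0;
}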