2007
DOI: 10.1016/j.parco.2007.06.006
Optimizing a conjugate gradient solver with non-blocking collective operations

Abstract: This paper presents a case study about the applicability and usage of non-blocking collective operations. These operations provide the ability to overlap communication with computation and to avoid unnecessary synchronization. We introduce our NBC library, a portable low-overhead implementation of non-blocking collectives on top of MPI-1. We demonstrate the easy usage of the NBC library with the optimization of a conjugate gradient solver with only minor changes to the traditional parallel implementat…

Cited by 54 publications (23 citation statements)
References 14 publications
“…We therefore conclude that overlapping the neighbor exchange communication with steps 2 and 3 should show a reasonable performance benefit at any scale. Overlapping this kind of communication has been successfully demonstrated on a regular grid in [1]. We expect the irregular grid to achieve similar performance improvements which could result in a reduction of the communication overhead.…”
Section: Domain Parallelization (mentioning)
confidence: 87%
“…The basic equations of DFT are the static and time-dependent Kohn-Sham equations: $H\phi_j = \varepsilon_j \phi_j$ and $i\,\frac{\partial}{\partial t}\phi_j(t) = H(t)\,\phi_j(t)$…”
Section: Introduction (mentioning)
confidence: 99%
“…Examples include [29] and can be found at the LibNBC webpage [30]. Other double-buffering based schemes to optimize parallel implementations of more algorithms (e.g.…”
Section: Discussion (mentioning)
confidence: 99%
“…A particular example with a three-dimensional Poisson equation showed a performance improvement of 34% by applying nonblocking collective operations [5]. The runtime of a strong-scaling medical image reconstruction algorithm [6] could be improved up to 8%.…”
Section: Nonblocking Collective Communication (mentioning)
confidence: 99%