2006
DOI: 10.1007/11846802_36

Automatic Memory Optimizations for Improving MPI Derived Datatype Performance

Abstract: MPI derived datatypes allow users to describe noncontiguous memory layout and communicate noncontiguous data with a single communication function. This powerful feature enables an MPI implementation to optimize the transfer of noncontiguous data. In practice, however, many implementations of MPI derived datatypes perform poorly, which makes application developers avoid using this feature. In this paper, we present a technique to automatically select templates that are optimized for memory performance…

citations
Cited by 12 publications
(5 citation statements)
references
References 9 publications
“…Although better results have been reported by, for instance, Hoefler and Gottlieb (2010) and Lu et al (2004), this inconsistency seems to hamper the widespread use of derived datatypes. This, despite a considerable amount of work by Byna et al (2006), Ross et al (2003), Schneider et al (2013) and Wu et al (2004) among others addressing the issue. Schneider et al (2013) also demonstrate runtime compilation for MPI datatypes.…”
Section: Background and Related Work (mentioning)
confidence: 97%
“…The same is true for codes which perform "manual packing" of data before sending, a practice widespread in old HPC codes, due to the fact that a compiler can optimize the packing loops of specialized codes, e.g., utilize vector instructions to copy blocks, while a simple MPI implementation might only provide a non-specialized generic DDT interpreter. However, much effort has been directed into optimizing MPI DDT implementations, and using them often gives superior performance [21][22][23]. Fig.…”
Section: MPI Derived Datatypes the Simplest MPI Derived (mentioning)
confidence: 99%
“…Several works have been focusing on optimizing MPI datatypes with different approaches: Traff et al [48] show that some complex derived datatypes can be transformed into simpler ones, improving the packing and unpacking performance with a more compact representation. Gropp et al [21] provide a classification of the MPI derived datatypes based on their memory access patterns, discussing how they can be efficiently implemented or automatically optimized [22]. Both approaches are orthogonal to this work because they propose optimizations that can be applied before offloading and can be integrated in the offloaded handlers.…”
Section: Related Work (mentioning)
confidence: 99%
“…Although better results have been reported [3,7], this inconsistency seems to hamper the widespread use of derived datatypes. This, despite a considerable amount of work [2,12,13,15] addressing the issue. In [13], Schneider et al also demonstrate runtime compilation for MPI datatypes.…”
Section: MPI Derived Datatypes (mentioning)
confidence: 99%