Proceedings of the 19th Annual International Conference on Supercomputing 2005
DOI: 10.1145/1088149.1088202

Automatic generation and tuning of MPI collective communication routines

Abstract: In order for collective communication routines to achieve high performance on different platforms, they must be able to adapt to the system architecture and use different algorithms for different situations. Current Message Passing Interface (MPI) implementations, such as MPICH and LAM/MPI, are not fully adaptable to the system architecture and are not able to achieve high performance on many platforms. In this paper, we present a system that produces efficient MPI collective communication routines. By automat…

Cited by 78 publications (67 citation statements)
References 20 publications

Citation statements
“…The results for topologies (2) and (3) are similar to those for topology (4). Since all algorithms run over MPICH except LAM, we also include a binomial tree implementation (the algorithm used in LAM) over MPICH in the comparison.…”
Section: Methods (mentioning)
confidence: 78%
“…Figure 8 shows the performance of pipelined broadcast using different contention-free trees on topology (1). The performance of pipelined broadcast on topologies (2), (3), (4), and (5) has a similar trend. As can be seen from the figure, when the message size is large (≥ 32KB), the linear tree offers the best performance.…”
Section: Methods (mentioning)
confidence: 78%
“…Instead, we develop practical techniques to facilitate the deployment of pipelined broadcast on clusters connected by multiple Ethernet switches. Similar to other architecture specific collective communication algorithms [8,10,17], the techniques developed in this paper can be used in advanced communication libraries [7,9,13,30]. Our research extends the work in [11,23,28] by considering multiple switches.…”
Section: Related Work (mentioning)
confidence: 75%
“…There are techniques to develop adaptive MPI routines that use different algorithms according to the message sizes [7,20]. These adaptive techniques allow our algorithms and the complementary algorithms for broadcasting small messages to co-exist in one MPI routine.…”
Section: Related Work (mentioning)
confidence: 99%
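
The idea in the last statement, one routine that picks an algorithm from the message size, can be illustrated with a minimal sketch. This is not the cited paper's system or any MPI library's API. The function name, the algorithm labels, and the choice of a binomial tree for small messages are assumptions made for illustration. The 32 KB cutoff is borrowed from the pipelined-broadcast statement quoted above, which reports it for one specific topology, so a real library would tune it per platform.

```python
# Hypothetical cutoff: 32 KB is taken from the quoted citation statement,
# where it is reported for one topology only. Treat it as a tunable value.
LARGE_MESSAGE_BYTES = 32 * 1024


def choose_broadcast_algorithm(message_bytes, threshold=LARGE_MESSAGE_BYTES):
    """Return the label of the broadcast algorithm to use for this size.

    Assumption: small messages use a binomial tree and large messages use
    a pipelined broadcast over a linear tree. Both labels are illustrative.
    """
    if message_bytes < 0:
        raise ValueError("message size must be non-negative")
    if message_bytes >= threshold:
        return "pipelined_linear_tree"
    return "binomial_tree"


if __name__ == "__main__":
    for size in (1024, 16 * 1024, 32 * 1024, 1024 * 1024):
        print(size, choose_broadcast_algorithm(size))
```

Keeping the threshold as a parameter rather than a hard-coded constant reflects the point of the abstract: the best crossover depends on the system architecture, so it should be measured rather than fixed.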