Proceedings. Tenth International Conference on Parallel and Distributed Systems, 2004. ICPADS 2004.
DOI: 10.1109/icpads.2004.1316097

Scaling all-to-all multicast on fat-tree networks

Abstract: In this paper, we study the all-to-all

Citations: cited by 22 publications (21 citation statements)
References: 20 publications
“…Most of the previous work [2,12,27,25,26,18,13,16] addresses congestion in the core (switches) of HPC networks. As our experimental evaluation shows, the advent of multicore processors introduces congestion at the edge of these networks and mechanisms to handle Concurrency Congestion are required for best performance on contemporary hardware.…”
Section: Discussion
mentioning
confidence: 99%
“…Yang and Wang [25,26] discussed algorithms for near optimal all-to-all broadcast on meshes and tori. Kumar and Kale [18] discussed algorithms to optimize all-to-all multicast on fat-tree networks. Dvorak et al [13] described techniques for topology aware scheduling of many-to-many collective operations.…”
Section: Related Work
mentioning
confidence: 99%
“…Implementations can be designed to exploit the host machine's native network architecture, but a poor MPI implementation can be a source of serious performance problems in large-scale applications. For example, even on a high-bandwidth InfiniBand network, an implementation of collective operations such as multicast must avoid congestion to achieve good performance (Kumar and Kale, 2004).…”
mentioning
confidence: 99%
“…Many all-to-all broadcast algorithms were designed for specific network topologies that are used in parallel machines, including hypercube [8,20], mesh [15,18,21], torus [21], k-ary n-cube [20], fat tree [10], and star [14]. Work in [9] optimizes MPI collective communications, including MPI_Allgather, on wide area networks.…”
Section: Related Work
mentioning
confidence: 99%