18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
DOI: 10.1109/ipdps.2004.1303181

Optimizing the steady-state throughput of scatter and reduce operations on heterogeneous platforms

Abstract: In this paper, we consider the communications involved in the execution of a complex application deployed on a heterogeneous large-scale distributed platform. Such applications make intensive use of collective macro-communication schemes, such as scatters, personalized all-to-alls, or gather/reduce operations. Rather than aiming to minimize the execution time of a single macro-communication, we focus on steady-state operation. We assume that there is a large number of macro-communications to perform in pipelin…

Cited by 5 publications
(7 citation statements)
References 9 publications
“…Legrand, Marchal and Robert [11] study steady-state situations where a series of reductions are performed. As in our work, the reduction operation is assumed to be indivisible, transfers and computations can overlap and the full-duplex one-port model is considered.…”
Section: Related Work (mentioning)
confidence: 99%
“…But the general problem is very difficult: we have proved that determining the optimal throughput is an NP-complete problem. This negative result demonstrates that pipelining multicasts is more difficult than pipelining broadcasts, scatters or reduce operations, for which optimal polynomial algorithms have been introduced [6,22]. In particular, although the broadcast and the multicast problems seem very similar (the target set is the only difference), there is a complexity gap between them: the best throughput to broadcast a message to all nodes in the platform can be found in polynomial time, whereas finding the best throughput to multicast a message to a strict subset of these nodes is NP-hard.…”
Section: Results (mentioning)
confidence: 99%
“…From the solution of the Multicast-UB(P, P_target) linear program, a solution achieving the same throughput can be easily obtained. The interested reader may refer to [22,21] where the reconstruction scheme from the linear program is presented. In this section, we prove that such a solution may differ at most by a factor |P_target| from the solution of the throughput given by the solution of Multicast-LB(P, P_target), thus providing a heuristic with a guaranteed factor |P_target|.…”
Section: Distance Between Lower and Upper Bounds (mentioning)
confidence: 99%
“…Reductions have been used in distributed programs for years, and standards such as Message Passing Interface (MPI) usually include a ‘reduce’ function together with other collective communications (see for experimental comparisons). Many algorithms have been introduced to optimize this operation on various platforms, with homogeneous or heterogeneous communication costs. Recently, this operation has received more attention because of the success of the MapReduce framework, which has been popularized by Google.…”
Section: Introduction (mentioning)
confidence: 99%