2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PHD Forum 2011
DOI: 10.1109/ipdps.2011.221

ConnectX-2 CORE-Direct Enabled Asynchronous Broadcast Collective Communications

Cited by 15 publications (6 citation statements)
References 8 publications
“…As a principled approach to network offloading, sPIN has the potential to replace specific offload solutions such as ConnectX CORE-Direct collective offload [10], Cray Aries [2], IBM PERCS [11], or Portals 4 [12] triggered operations. Instead, the community can focus on developing domain or application-specific sPIN libraries to accelerate networking, very much like NVIDIA's cuBLAS or ViennaCL [13].…”
Section: Motivation (mentioning)
confidence: 99%
“…All benefits known from collective offloading implementations [10,12,29] such as asynchronous progression and noise resilience remain true for sPIN. As opposed to existing offloading frameworks that restrict the collective algorithms (e.g., to pre-defined trees), sPIN supports arbitrary algorithms (including pipeline and double-tree [30]) due to the flexible programmability and high forwarding performance of the HPUs.…”
Section: sPIN Offloaded Broadcast (mentioning)
confidence: 99%
“…This management queue allows to delay certain operations until others are finished, and therefore to express dependencies between operations. Researchers reported positive results when implementing single collectives such as barrier and broadcast with this technology [26], [27]. To the best of our knowledge no one has attempted to show that the primitives offered by CORE-Direct are powerful enough to offload any communication schedule, or shown its limits.…”
Section: Experimental Evaluation (mentioning)
confidence: 99%
“…A big motivation for this work, as well as our previous collective algorithm work [16], is to develop collective communications that provide a high-degree of ability to overlap computation and communication leading to effective system utilization. While providing the ability to overlap and providing high-performance communication is essential, it is just as important to provide the basic capabilities needed to implement asynchronous communication algorithms.…”
Section: B. Communication-Computation Overlap (mentioning)
confidence: 99%