Proceedings 16th International Parallel and Distributed Processing Symposium 2002
DOI: 10.1109/ipdps.2002.1016563

Protocols and strategies for optimizing performance of remote memory operations on clusters

Cited by 26 publications (25 citation statements)
References 9 publications
“…Our work on GASNet [20] and that of Nieplocha et al on ARMCI [35] show that in fact one-sided communication can often outperform two-sided message-passing communication. Moreover, the results of this paper in the context of Titanium and others in the context of CAF [37] and UPC [5] show that these performance advantages carry over to application-level performance in the NAS benchmarks.…”
Section: Zpl
Citation type: mentioning
confidence: 99%
“…Our original XLUPC Myrinet port implements multiple transfer protocols depending on message size [20]. Short messages are copied to avoid memory registration costs.…”
Section: Considerations For the Myrinet/gm Implementation
Citation type: mentioning
confidence: 99%
“…MPI implementations like OpenMPI [22] and MVAPICH [27] as well as one-sided messaging systems like ARMCI [20] follow a differential approach based on message size, switching between preallocated registered memory buffers (Bounce Buffers) for short messages and dynamic memory registration and de-registration as part of each data transfer (Rendezvous) for large ones. The crossover point between the protocols is dependent on the underlying network hardware and software, requiring tuning for each machine.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
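
The size-based protocol switch described in the statement above can be sketched as follows. This is a minimal illustration, not code from ARMCI, OpenMPI, or MVAPICH: the names `CROSSOVER_BYTES`, `TransferLog`, `choose_protocol`, and `send` are hypothetical, the 16 KiB threshold is an assumed placeholder (the statement notes the real crossover must be tuned per machine), and memory registration is only counted, not performed.

```python
from dataclasses import dataclass, field

# Assumed placeholder threshold in bytes; real crossover points are tuned
# for each network and machine.
CROSSOVER_BYTES = 16 * 1024


@dataclass
class TransferLog:
    """Counts the costs each protocol incurs (illustrative bookkeeping only)."""
    copies: int = 0          # copies into a preregistered bounce buffer
    registrations: int = 0   # dynamic register/deregister cycles
    protocol_used: list = field(default_factory=list)


def choose_protocol(nbytes: int, crossover: int = CROSSOVER_BYTES) -> str:
    """Pick a protocol from the message size alone."""
    return "bounce_buffer" if nbytes <= crossover else "rendezvous"


def send(payload: bytes, log: TransferLog,
         crossover: int = CROSSOVER_BYTES) -> bytes:
    """Simulate one transfer and record which costs were paid."""
    protocol = choose_protocol(len(payload), crossover)
    log.protocol_used.append(protocol)

    if protocol == "bounce_buffer":
        # Short message: pay for one copy into a buffer that is already
        # registered, avoiding any registration cost.
        staged = bytearray(crossover)
        staged[:len(payload)] = payload
        log.copies += 1
        return bytes(staged[:len(payload)])

    # Large message: register the user buffer for this transfer, send it
    # without an intermediate copy, then deregister it.
    log.registrations += 1
    return payload
```

Under these assumptions, a 100-byte message is staged through the bounce buffer, while a message one byte over the threshold takes the rendezvous path and pays one registration instead of a copy.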
“…ARMCI provides an excellent implementation substrate for global address space languages making use of coarse-grain communication because it achieves high performance on a variety of networks (including Myrinet, Quadrics, and IBM's switch fabric for its SP systems) as well as shared memory platforms (Cray X1, SGI Altix3000, SGI Origin2000) while insulating its clients from platform-specific implementation issues such as shared memory, threads, and DMA engines. A notable feature of ARMCI is its support for non-contiguous data transfers [12].…”
Section: Communication Libraries
Citation type: mentioning
confidence: 99%