2009 IEEE International Symposium on Parallel & Distributed Processing, 2009
DOI: 10.1109/ipdps.2009.5161025

Scalable RDMA performance in PGAS languages

Abstract: Partitioned Global Address Space (PGAS) languages provide a unique programming model that can span shared-memory multiprocessor (SMP) architectures, distributed-memory machines, or clusters of SMPs. Users can program large-scale machines with easy-to-use shared-memory paradigms. To exploit large-scale machines efficiently, PGAS language implementations and their runtime systems must be designed for scalability and performance. The IBM XLUPC compiler and runtime system provide a scalable design through t…

Cited by 20 publications (8 citation statements)
References 14 publications
“…We incur negligible performance loss or marginal performance gain for all benchmarks ( tered until MPI_Finalize) slightly improves performance (1.12%) for CG. On a system with a higher deregistration cost, such as Myrinet/GM [7], we expect a larger performance improvement. Figure 11(f) displays the scenario we have shown in section 1.…”
Section: NAS Parallel Benchmarks (mentioning)
confidence: 98%
“…They also batch deregistrations to reduce the average cost. Farreras et al. [7] proposed a pin-down cache for Myrinet. They delay deregistration and cache registration information for future accesses to the same memory region.…”
Section: Related Work (mentioning)
confidence: 99%
“…On the other hand, the IBM XLUPC compiler and runtime system uses a shared variable directory (SVD) to share the location of shared variables. The runtime system employs a local cache to reduce SVD accesses and allow RDMA accesses [11]. This is designed for large scale system and does not particularly address multi- and many-core systems that have lower latency.…”
Section: Related Work (mentioning)
confidence: 99%
“…For Myrinet networks GASNet provides a conduit for the GM driver [5], as does the IBM APGAS runtime [18]. GM is a legacy low-level messaging system for Myrinet network which was replaced by MX.…”
Section: Related Work (mentioning)
confidence: 99%