We are developing parallel programming models that complement related projects and respond to needs that remain unaddressed in the parallel computing community. These needs include incremental or partial migration of applications and their expert programmers away from MPI, and efficient support for high-volume, random, fine-grained parallelism.

A programming model provides an abstraction for expressing parallelism in applications. This abstraction must sit at a level that allows the inherent parallelism to be mapped onto the capabilities of the underlying hardware. MPI is the de facto standard for high-performance computing, mainly because its abstraction matches distributed-memory architectures so closely. However, certain types of parallelism, such as that found in parallel graph algorithms, are difficult to express directly in MPI. Meanwhile, PetaFLOP-scale hardware is approaching and vendors are developing multi-core processors; MPI may not be a suitable programming model for these new architectures.

Global address space (GAS) models are more expressive than MPI. These models can be realized as libraries callable from conventional languages (such as SHMEM, MPI-2, Portals) or as language extensions (such as UPC, Co-Array Fortran). Existing GAS models typically support one of two levels of abstraction: one-sided communication, which allows a processor to access another processor's memory without the remote processor's cooperation, or distributed shared memory, which provides a logically global view of the data. Accessing shared data that resides on another processor requires communication, which is more expensive than access to local data, and too much fine-grained communication incurs a significant performance penalty because each transaction pays a separate communication latency. For good performance, users must therefore manage data locality carefully to minimize fine-grained communication. Without system-level support, this locality-management burden can erase the convenience the programming model was meant to provide. Offering ad hoc support for random communication patterns is not enough either; it leads to a large and ever-growing collection of such utilities, and again undermines the programming ease intended by the model. Therefore, a higher level of abstraction is desirable.
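To make the one-sided level of abstraction concrete, the following minimal sketch uses the standard MPI-2 RMA interface: each rank exposes a local array through a window, and any rank may then read a remote element with MPI_Get without the owner's cooperation. The array name, sizes, and neighbor-access pattern are illustrative assumptions, not part of this work; note that each such element-wise MPI_Get is a separate network transaction, which is exactly the fine-grained latency cost discussed above.

```c
/* Minimal sketch of one-sided access via MPI-2 RMA.
 * Array name, size N, and the neighbor access pattern
 * are illustrative assumptions. */
#include <mpi.h>
#include <stdio.h>

#define N 1024

int main(int argc, char **argv)
{
    int rank, nprocs;
    double local[N];        /* memory this rank exposes */
    double remote_val;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    for (int i = 0; i < N; i++)
        local[i] = rank * N + i;

    /* Expose `local` for remote access through a window. */
    MPI_Win_create(local, N * sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Fence opens an access epoch; each rank then reads one
     * element from its right neighbor without that neighbor's
     * cooperation.  Done at high volume, every such access is
     * a separate network transaction paying full latency. */
    MPI_Win_fence(0, win);
    MPI_Get(&remote_val, 1, MPI_DOUBLE,
            (rank + 1) % nprocs,  /* target rank          */
            0,                    /* target displacement  */
            1, MPI_DOUBLE, win);
    MPI_Win_fence(0, win);

    printf("rank %d read %g\n", rank, remote_val);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```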
Careful evaluation of the issues listed above and our in-depth study of Sandia applications suggest that the next appropriate level of abstraction should support high-volume, random, fine-grained parallel data access. Our work has three parts: BEC, a bootstrap approach that adds GAS capabilities to MPI; PRAM C, a C language extension that supports parallel random access and maximal expression of parallelism through virtual processors; and translation, a new scheme that statically compiles fine-grained parallelism into coarse-grained parallelism.

Specifically, BEC (Bundle-Exchange-Compute) is an abstraction formalized from well-practiced MPI programming techniques. In dealing with high-volume, fine...
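BEC's interface is not given here; the following is a hedged sketch of the well-practiced MPI idiom from which the abstraction is formalized: bundle fine-grained updates by destination rank, exchange all bundles in a single collective step, then compute on the received data locally. The names update_t, DEST, and apply_update are hypothetical placeholders for an application's update stream.

```c
/* Sketch of the bundle-exchange-compute idiom in plain MPI.
 * update_t, DEST, and apply_update are hypothetical; a
 * homogeneous machine is assumed (bundles ship as raw bytes). */
#include <mpi.h>
#include <stdlib.h>

typedef struct { int index; double value; } update_t;

/* Hypothetical owner map and local computation. */
static int  DEST(const update_t *u, int nprocs) { return u->index % nprocs; }
static void apply_update(const update_t *u)     { (void)u; /* app-specific */ }

void bundle_exchange_compute(const update_t *upd, int nupd,
                             int nprocs, MPI_Comm comm)
{
    int *scnt = calloc(nprocs, sizeof(int));
    int *rcnt = malloc(nprocs * sizeof(int));
    int *sdsp = malloc(nprocs * sizeof(int));
    int *rdsp = malloc(nprocs * sizeof(int));
    int *fill = calloc(nprocs, sizeof(int));

    /* Bundle: count updates per destination, then pack each
     * update into its destination's contiguous bundle. */
    for (int i = 0; i < nupd; i++)
        scnt[DEST(&upd[i], nprocs)]++;
    sdsp[0] = 0;
    for (int p = 1; p < nprocs; p++)
        sdsp[p] = sdsp[p - 1] + scnt[p - 1];

    update_t *sbuf = malloc(nupd * sizeof(update_t));
    for (int i = 0; i < nupd; i++) {
        int d = DEST(&upd[i], nprocs);
        sbuf[sdsp[d] + fill[d]++] = upd[i];
    }

    /* Exchange: one collective step replaces many small messages. */
    MPI_Alltoall(scnt, 1, MPI_INT, rcnt, 1, MPI_INT, comm);
    rdsp[0] = 0;
    for (int p = 1; p < nprocs; p++)
        rdsp[p] = rdsp[p - 1] + rcnt[p - 1];
    int nrecv = rdsp[nprocs - 1] + rcnt[nprocs - 1];
    update_t *rbuf = malloc(nrecv * sizeof(update_t));

    for (int p = 0; p < nprocs; p++) {   /* scale to byte counts */
        scnt[p] *= sizeof(update_t); sdsp[p] *= sizeof(update_t);
        rcnt[p] *= sizeof(update_t); rdsp[p] *= sizeof(update_t);
    }
    MPI_Alltoallv(sbuf, scnt, sdsp, MPI_BYTE,
                  rbuf, rcnt, rdsp, MPI_BYTE, comm);

    /* Compute: apply every received update locally. */
    for (int i = 0; i < nrecv; i++)
        apply_update(&rbuf[i]);

    free(scnt); free(rcnt); free(sdsp); free(rdsp);
    free(fill); free(sbuf); free(rbuf);
}
```

Aggregating many fine-grained accesses into one collective exchange amortizes per-message latency across the whole bundle, which is the performance rationale behind this pattern.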