Abstract: The performance of collective communication operations is known to have a significant impact on the scalability of some applications. Indeed, the global, synchronous nature of some collective operations directly implies that they will become the bottleneck when scaling to hundreds of thousands of nodes. This fact has led many researchers to try to improve the efficiency of collective operations. One popular approach improves the implementation of MPI collective operations by using intelligent or programmable network interfaces to offload the burden of communication activities from the host processor(s). Such implementations have shown significant improvement for microbenchmarks that isolate collective communication performance, but these results have not been shown to translate to significant increases in performance for real applications. In order for collective offload implementations to benefit real applications, a greater understanding of application behavior is needed. In this paper, we describe several characteristics of applications and application benchmarks that impact collective communication performance. We analyze network resource usage data in order to guide the design of collective offload engines and their associated programming interfaces. In particular, we provide an analysis of the potential benefit of non-blocking collective communication operations for MPI.
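The potential benefit of non-blocking collectives is that communication latency can be hidden behind computation that does not depend on the collective's result. The pattern can be sketched without real MPI; in this toy Python illustration, a background thread and the made-up `offloaded_allreduce` helper stand in for a NIC-offloaded collective and for the post/wait pair (analogous to the `MPI_Iallreduce`/`MPI_Wait` interface later standardized in MPI-3):

```python
import threading
import time

# Hypothetical stand-in for a NIC-offloaded, non-blocking collective:
# the "network" computes the reduction while the host keeps working.
def offloaded_allreduce(values, out):
    time.sleep(0.05)          # simulated network/collective latency
    out.append(sum(values))   # reduction result delivered asynchronously

values = [1.0, 2.0, 3.0, 4.0]
out = []

t = threading.Thread(target=offloaded_allreduce, args=(values, out))
t.start()                                  # post the collective and return immediately
independent = sum(v * v for v in values)   # overlap: work not needing the result
t.join()                                   # block only when the result is required

print(out[0], independent)  # → 10.0 30.0
```

The host only pays for the collective's latency to the extent that it runs out of independent work to overlap with it, which is exactly the application characteristic the analysis above seeks to quantify.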
Understanding the message-passing behavior and network resource usage of distributed-memory message-passing parallel applications is critical to achieving high performance and scalability. While much research has focused on how applications use critical compute-related resources, relatively little attention has been devoted to characterizing the usage of network resources, specifically those needed by the network interface. This paper discusses the importance of understanding network interface resource usage requirements for parallel applications and describes an initial attempt to gather network resource usage data for several real-world codes. The results show widely varying usage patterns between processes in the same parallel job and indicate that resource requirements can change dramatically as process counts increase and input data changes. This suggests that general network resource management strategies may not be widely applicable, and that adaptive strategies or more fine-grained controls may be necessary for environments where network interface resources are severely constrained.
We are developing parallel programming models that are complementary to related projects and respond to unaddressed needs in the parallel computing community. These needs include incremental or partial migration of applications and their expert programmers from MPI, and efficient support for high-volume, random, fine-grained parallelism.

A programming model provides an abstraction for expressing parallelism in applications. This abstraction must be at an appropriate level such that inherent parallelism can be mapped to the capabilities of the underlying hardware. MPI is the de facto standard for high-performance computing, mainly because its abstraction closely matches distributed-memory architectures. However, certain types of parallelism, such as parallel graph algorithms, are difficult to express directly in MPI. Meanwhile, PetaFLOP-scale hardware is approaching and vendors are developing multi-core processors; MPI may not be a suitable programming model for these new architectures.

GAS (global address space) models are more expressive than MPI. These models can be realized as libraries (such as SHMEM, MPI-2, and Portals) that are callable from conventional languages, or as language extensions (such as UPC and Co-Array Fortran). Existing GAS models typically support one of two levels of abstraction: one-sided communication, which allows a processor to access another processor's memory without the remote processor's cooperation, or distributed shared memory, which provides a logically global view of the data. Accesses to shared data on other processors require communication, which is more expensive than access to local data, and too much fine-grained communication incurs a significant performance penalty from the latency of each separate transaction. For good performance, users must manage data locality carefully to minimize fine-grained communication. Without system-level support, the task of data locality management can diminish the convenience intended by this programming model.
Offering ad hoc support for random communication patterns is not enough; it leads to a large, ever-increasing number of such utilities, and again undermines the programming ease intended by the model. Therefore, a higher level of abstraction is desirable. Careful evaluation of the issues listed above and our in-depth study of Sandia applications suggest that the next appropriate level of abstraction should support high-volume, random, fine-grained parallel data access. Our work has three parts: BEC, a bootstrap approach to add GAS capabilities to MPI; PRAM C, a C language extension to support parallel random access and maximal expression of parallelism in virtual processors; and translation, a new scheme that statically compiles fine-grained parallelism into coarse-grained parallelism. Specifically, BEC (Bundle-Exchange-Compute) is an abstraction formalized from well-practiced MPI programming techniques. In dealing with high-volume, fine...

Copyright is held by the author/owner(s). SPAA'06, July 30-August 2, 2006, Cambridge, Massachusetts, USA. ACM 1-59593-452-9/06/0007.
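The latency argument behind bundling can be sketched with a toy linear latency-plus-bandwidth cost model. The constants and the `transfer_cost` helper below are made up for illustration; the point is only that aggregating many fine-grained remote accesses into one exchange, as a Bundle-Exchange-Compute cycle does, amortizes the fixed per-message cost:

```python
# Toy latency/bandwidth cost model (hypothetical units and constants)
LATENCY = 5.0      # fixed per-message startup cost
PER_BYTE = 0.01    # incremental cost per byte transferred

def transfer_cost(num_messages, total_bytes):
    """Linear model: cost = messages * latency + bytes * per-byte cost."""
    return num_messages * LATENCY + total_bytes * PER_BYTE

# 1000 independent 8-byte remote accesses vs. one bundled exchange of the same data
fine_grained = transfer_cost(1000, 1000 * 8)   # 1000*5.0 + 8000*0.01 = 5080.0
bundled = transfer_cost(1, 1000 * 8)           #    1*5.0 + 8000*0.01 =   85.0

print(fine_grained, bundled)  # → 5080.0 85.0
```

Under this model the bundled exchange moves the same 8000 bytes at a small fraction of the cost, because only one message startup is paid instead of a thousand.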
Sandia is a multiprogram laboratory, 2403 San Mateo NE, Albuquerque, New Mexico 87110.

Abstract: The items discussed in this report reflect the work in progress during FY98. As a way to bootstrap the DISCOM' Distance Computing Program, the SP2 Pilot Project was launched in March 1998. The Pilot was directed toward creating an environment that would allow Sandia users to run their applications on the Accelerated Strategic Computing Initiative's (ASCI) Blue Pacific computation platform, the unclassified IBM SP2 platform at Lawrence Livermore National Laboratory (LLNL). The DISCOM' Pilot leverages the ASCI PSE (Problem Solving Environment) efforts in networking and services to baseline the performance of the current system. Efforts in the following areas of the pilot are documented: applications, services, networking, visualization, and the system model. The report details not only the running of two Sandia codes, CTH and COYOTE, on the Blue Pacific platform, but also the building of the Sandia National Laboratories (SNL) proxy environment of RS6000 platforms to support Sandia users.