In this work we present a microbenchmark methodology for assessing the overheads associated with nested parallelism in OpenMP. Our techniques are based on extensions to the well-known EPCC microbenchmark suite that allow measuring the overheads of OpenMP constructs when they are executed at inner levels of parallelism. The methodology is simple yet powerful, and it has enabled us to gain insight into problems related to implementing and supporting nested parallelism. We measure and compare a number of commercial and freeware compilation systems. Our general conclusion is that while nested parallelism is fortunately supported by many current implementations, the performance of this support is rather problematic. There appear to be issues that have not yet been addressed effectively, as most OpenMP systems do not react gracefully when made to execute inner levels of concurrency.
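The following is a minimal, hypothetical sketch of the kind of EPCC-style subtraction measurement the abstract describes, extended to an inner level of parallelism: the overhead of a nested parallel construct is estimated as the elapsed time of a test loop (which opens an inner parallel region on each iteration, from within an already-parallel outer region) minus a reference loop performing the same delay work. The names and constants (delay, INNERREPS, thread counts) are illustrative, not taken from the paper.

```c
#include <omp.h>
#include <stdio.h>

#define INNERREPS 1000
#define DELAY_LEN 500

static volatile double sink = 0.0;

/* Fixed amount of busy work, as in the EPCC delay routine. */
static void delay(int len) {
    double a = 0.0;
    for (int i = 0; i < len; i++) a += i * 0.5;
    sink = a;
}

int main(void) {
    omp_set_nested(1);                      /* enable nested parallelism */

    /* Reference time: the delay executed INNERREPS times sequentially. */
    double t0 = omp_get_wtime();
    for (int k = 0; k < INNERREPS; k++) delay(DELAY_LEN);
    double t_ref = omp_get_wtime() - t0;

    /* Test time: the same delay, but each repetition opens an inner
     * parallel region from inside an outer parallel region. */
    double t_test = 0.0;
    #pragma omp parallel num_threads(4)
    {
        double t1 = omp_get_wtime();
        for (int k = 0; k < INNERREPS; k++) {
            #pragma omp parallel num_threads(2)   /* nested (inner) level */
            { delay(DELAY_LEN); }
        }
        #pragma omp master
        t_test = omp_get_wtime() - t1;
    }

    printf("estimated nested parallel overhead per construct: %g us\n",
           1e6 * (t_test - t_ref) / INNERREPS);
    return 0;
}
```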
OpenMP can be supported in cluster environments by using distributed shared memory (DSM) systems. A portable approach to building such a DSM system is to layer it on MPI. With these goals in mind, this paper makes two contributions. The first is a discussion of two software DSM systems that we have implemented using MPI. One uses background polling threads, while the other uses processes that are driven only by incoming MPI messages. Comparisons of the two approaches show the latter to be a more scalable architecture that is better suited to the multi-core processors that are becoming commonplace. The second contribution recognizes that a common workaround for sub-team synchronization in OpenMP is to use the flush directive on shared variables within busy-wait loops. In such a situation, only the flush in the last iteration of the busy-wait loop produces the condition necessary for exiting the loop. Thus the shared value need only be transferred when it has changed. We implement in our DSM a flush mechanism that eliminates these unnecessary data transfers entirely, without any additional support or hints from the programmer.
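For reference, this is a minimal sketch of the busy-wait/flush idiom the abstract refers to (the variable names and thread roles are assumptions for illustration, not the paper's code): a consumer thread spins on a shared flag, flushing it on every iteration, until a producer sets it; in a DSM, only the iteration in which the flag has actually changed needs to move data.

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    int flag = 0;       /* shared flag used for point-to-point synchronization */
    int payload = 0;    /* data made visible to the consumer once flag is set */

    #pragma omp parallel num_threads(2) shared(flag, payload)
    {
        int tid = omp_get_thread_num();

        if (tid == 0) {                     /* producer */
            payload = 42;
            #pragma omp flush(payload)
            flag = 1;
            #pragma omp flush(flag)
        } else {                            /* consumer: busy-wait on flag */
            int done = 0;
            while (!done) {
                #pragma omp flush(flag)     /* only the last flush sees flag == 1 */
                done = flag;
            }
            #pragma omp flush(payload)
            printf("thread %d saw payload = %d\n", tid, payload);
        }
    }
    return 0;
}
```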
The OpenMP shared memory programming paradigm has been widely embraced by the computational science community, as have distributed-memory clusters. What are the prospects for running OpenMP applications on clusters? This paper gives an overview of the SCore cluster-enabled OpenMP environment, provides performance data for some of the fundamental underlying operations, and reports overall performance for a model computational science application (the finite difference solution of the 2D Laplace equation).
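As an illustration of the model application mentioned above (not the paper's own code), a Jacobi finite-difference solver for the 2D Laplace equation can be written with the grid sweep parallelized by an OpenMP worksharing loop; the grid size, boundary condition, and tolerance below are assumptions for the sketch.

```c
#include <math.h>
#include <stdio.h>

#define N 256
static double u[N][N], unew[N][N];

int main(void) {
    /* Boundary condition: top edge held at 1.0, everything else starts at 0. */
    for (int j = 0; j < N; j++) u[0][j] = unew[0][j] = 1.0;

    double diff;
    do {
        diff = 0.0;
        /* One Jacobi sweep: each interior point becomes the average of its
         * four neighbours; the reduction tracks the largest change. */
        #pragma omp parallel for reduction(max:diff)
        for (int i = 1; i < N - 1; i++)
            for (int j = 1; j < N - 1; j++) {
                unew[i][j] = 0.25 * (u[i-1][j] + u[i+1][j] + u[i][j-1] + u[i][j+1]);
                double d = fabs(unew[i][j] - u[i][j]);
                if (d > diff) diff = d;
            }

        /* Copy the new iterate back into u for the next sweep. */
        #pragma omp parallel for
        for (int i = 1; i < N - 1; i++)
            for (int j = 1; j < N - 1; j++)
                u[i][j] = unew[i][j];
    } while (diff > 1e-4);

    printf("converged, centre value %g\n", u[N/2][N/2]);
    return 0;
}
```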