In this work we report on our experiences running OpenMP programs on a commodity cluster of PCs running a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message passing counterparts and discuss performance differences.

* HLRS (High Performance Computing Center, Stuttgart).
† NASA employee of Computer Sciences Corporation.

1 Introduction

Computer architectures using clusters of PCs with commodity networking have become a low cost alternative for high end scientific computing. Currently, message passing is the dominating programming model for such clusters. The development of a parallel program based on message passing adds a new level of complexity to the software engineering process, since not only the computation but also the explicit movement of data between the processes must be specified.

Shared memory parallel processors (SMPs) provide a friendlier programming model. The use of globally addressable memory allows users to exploit parallelism while avoiding the difficulties of explicit data distribution on parallel machines. Parallelism is commonly achieved by multi-threading the execution of loops, and compiler directives to support multithreaded execution of loops are available on most shared memory parallel platforms. In addition, many compilers provide an automatic parallelization feature, taking the burden of code analysis off the user. The efficiency of compiler-parallelized code is often limited, however, since a thorough dependence analysis is not possible without user information. Alternatively, there are parallelization support tools available which take the tedious work of dependence analysis and directive generation off the user, while still allowing user guidance for critical parts of the code. An example of such a tool is CAPO [10].

While shared memory architectures provide a convenient programming model for the user, their drawback is that they are expensive, and the scalability of the code may be limited due to poor data locality.
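To make the directive-based model concrete, the following is a minimal sketch of loop-level parallelism in C with OpenMP; the array names and loop body are hypothetical and are not taken from the benchmarks discussed in this paper:

    #include <stdio.h>

    #define N 1000000

    static double a[N], b[N];

    int main(void) {
        /* Serial initialization of the input array. */
        for (int i = 0; i < N; i++)
            b[i] = (double)i;

        /* A single directive requests multithreaded execution of the
           loop; no explicit data distribution is specified. Compilers
           without OpenMP support simply ignore the pragma. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            a[i] = 2.0 * b[i];

        printf("a[%d] = %f\n", N - 1, a[N - 1]);
        return 0;
    }

The loop iterations are divided among the threads by the runtime system; the programmer marks the parallelism but never distributes the data by hand.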
2 Programming Models

Currently, message passing and shared address space are the two leading programming models for clusters of SMPs.
2.1 Message Passing

Message passing is a well understood programming paradigm. The computational work and the associated data are distributed between a number of processes. If a process needs to access data located in the memory of another process, the data has to be communicated via the exchange of messages. The data transfer requires cooperative operations to be performed by each process, that is, every send must have a matching receive. A regular message passing communication achieves two effects: communication of data from sender to receiver and synchronization of sender with receiver. MPI (Message Passing Interface) [12] is a widely accepted standard for writing message passing programs. It is a standard programming interface for the c...
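As a minimal sketch of this cooperative transfer (the buffer name and message tag below are hypothetical, not taken from any benchmark code), the fragment moves one value from rank 0 to rank 1; the receive blocks until the matching send is posted, so the exchange both communicates the data and synchronizes the two processes. It assumes at least two processes, e.g. launched with mpirun -np 2:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank;
        double buf = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            buf = 3.14;
            /* Every send must have a matching receive on the peer. */
            MPI_Send(&buf, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* Blocks until the message from rank 0 arrives: data
               movement and synchronization in a single operation. */
            MPI_Recv(&buf, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %f\n", buf);
        }

        MPI_Finalize();
        return 0;
    }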