A. Zaafrani scite author profile

In a SIMD or VLIW machine, conceptual synchronizations are accomplished by using a static code schedule that does not require run-time synchronization. The lack of run-time synchronization overhead makes these machines very effective for fine-grain parallelism, but they cannot execute parallel code structures as general as those executed by MIMD architectures, and this limits their utility.

In this paper we present a timing analysis that allows a compiler for a MIMD machine to eliminate a large fraction of the run-time synchronization by making efficient use of static code scheduling. Although these techniques can be adapted to be applied to most MIMD machines, this paper centers on the analysis and scheduling for barrier MIMD machines. Barrier MIMDs are asynchronous multiple instruction stream/multiple data stream architectures capable of parallel execution of variable execution-time instructions and arbitrary control flow (e.g., while loops and calls). However, they also incorporate a special hardware barrier synchronization mechanism that facilitates static scheduling by providing a mechanism which the compiler can use to enforce precise timing constraints. In other words, the compiler tracks relative timing between processors and uses static code scheduling until the timing imprecision becomes too large, at which point the compiler simply inserts a barrier to reduce that timing imprecision to zero (or a small constant).

This paper describes new scheduling and barrier placement algorithms for barrier MIMDs that are based loosely on the list scheduling approach employed for VLIWs [Ellis 1985]. In addition, the experimental results from scheduling thousands of synthetic benchmark programs for a parameterized barrier MIMD machine are presented.
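The barrier placement idea in the abstract (track timing, insert a barrier once the imprecision grows too large) can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Op` model, the `place_barriers` function, and the definition of imprecision as the widest per-processor window between best-case and worst-case elapsed time are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Op:
    """One instruction with a bounded, variable execution time (hypothetical model)."""
    name: str
    min_cycles: int
    max_cycles: int


def place_barriers(streams: List[List[Op]], max_imprecision: int) -> List[int]:
    """Return the step indices after which a barrier is inserted.

    Assumptions: every processor's stream has the same number of steps, and
    "timing imprecision" is modeled as the widest per-processor window
    (worst-case minus best-case elapsed cycles) since the last barrier.
    """
    n_procs = len(streams)
    earliest = [0] * n_procs  # best-case cycles since the last barrier
    latest = [0] * n_procs    # worst-case cycles since the last barrier
    barriers = []

    for step in range(len(streams[0])):
        for p in range(n_procs):
            op = streams[p][step]
            earliest[p] += op.min_cycles
            latest[p] += op.max_cycles

        imprecision = max(l - e for e, l in zip(earliest, latest))
        if imprecision > max_imprecision:
            # A barrier realigns all processors, so the windows collapse to zero.
            barriers.append(step)
            earliest = [0] * n_procs
            latest = [0] * n_procs

    return barriers


p0 = [Op("add", 1, 1), Op("load", 1, 3), Op("load", 1, 3), Op("add", 1, 1)]
p1 = [Op("add", 1, 1), Op("add", 1, 1), Op("mul", 2, 2), Op("div", 1, 4)]
print(place_barriers([p0, p1], max_imprecision=3))
```

With these two made-up instruction streams and a threshold of 3 cycles, the sketch places a single barrier after step 2, where the first processor's window reaches 4 cycles. A tighter threshold would produce more barriers, which mirrors the trade-off between static scheduling and run-time synchronization that the abstract describes.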

Partitioning the global space for distributed memory systems

Zaafrani

Ito

1993

Partitioning the iteration space can significantly affect the execution time of a loop. In this paper, we propose an improvement over previous partitioning methods for single loops with uniform data dependencies. For distributed memory systems, partitioning each loop separately does not guarantee an efficient execution of the code because of across-loop data dependence. As a result, a global iteration space is formed so that all loops in a program are considered when partitioning the global space.

In addition, a new and general form of expressing data dependence called hyperplane dependence is introduced and used in the partitioning. It is a dependence whose source and destination are subspaces (of any dimension) of the global iteration space.
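To make the notion of a uniform data dependence concrete, the sketch below covers only the textbook one-dimensional case and is not the partitioning method proposed in the paper. It assumes a single loop in which iteration `i` depends on iteration `i - distance`, as in `a[i] = a[i - distance] + 1`. The function names `partition_by_distance` and `crosses_partition` are hypothetical.

```python
from typing import List


def partition_by_distance(n_iters: int, distance: int) -> List[List[int]]:
    """Split iterations 0..n_iters-1 into `distance` independent chains.

    Iterations i and i - distance land in the same chain, so no dependence
    crosses a chain boundary.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return [list(range(start, n_iters, distance)) for start in range(distance)]


def crosses_partition(parts: List[List[int]], distance: int) -> bool:
    """Report whether any dependence (i - distance -> i) spans two partitions."""
    owner = {i: k for k, part in enumerate(parts) for i in part}
    return any(
        (i - distance) in owner and owner[i - distance] != owner[i]
        for i in owner
    )


chains = partition_by_distance(7, 3)
print(chains)                                            # [[0, 3, 6], [1, 4], [2, 5]]
print(crosses_partition(chains, 3))                      # False
print(crosses_partition([[0, 1, 2, 3], [4, 5, 6]], 3))   # True
```

In this toy case the chain partition keeps every dependence inside one partition, while a naive block partition splits some of them. On a distributed memory system a split dependence would presumably require communication between processors, which is one way the choice of partition can affect execution time.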

A. Zaafrani

Parallel Region Execution of Loops with Irregular Dependencies

Static synchronization beyond VLIW

Static scheduling for barrier MIMD architectures

Partitioning the global space for distributed memory systems
