Abstract. The automatic parallelization of C has always been frustrated by pointer arithmetic, irregular control flow and complicated data aggregation. Each of these problems is similar to familiar challenges encountered in the parallelization of more rigidly structured languages such as FORTRAN. By creating a mapping from one language to the other, we can expose the capabilities of existing automatically parallelizing compilers to the C language. In this paper, we describe our approach to mapping applications written in C to a form suitable for the Polaris source-to-source FORTRAN compiler. We also describe the improvements in the compiled applications realized by this second level of transformation and show results for a small application in comparison to commercial compilers.
Introduction

Polaris is an automatically parallelizing source-to-source FORTRAN compiler. It accepts FORTRAN77 input and produces FORTRAN output in a dialect that supports explicit parallelism by means of embedded directives such as OpenMP [Ope97] or the Sun FORTRAN Directives [Sun96]. The benefit that Polaris provides is in automating the analysis of the loops and array accesses in the application to determine how they can best be expressed to exploit available parallelism. Since FORTRAN naturally constrains the way in which parallelism exists, the analysis is somewhat more straightforward than with other languages. This allows Polaris to perform very complicated interprocedural and global analysis without risk of misinterpreting programmer intent. Experimental results show that Polaris is able to markedly improve the run-time of applications without additional programmer direction [PVE96, BDE+96].

The expressiveness and low-level memory access primitives of C make it ideally suited for translation into efficient machine language. However, these low-level operations interfere with further optimizations such as parallelization, software pipelining and various types of loop transformations. Much research has been performed in the areas of pointer analysis [CWZ90, DMM98, GH95] and control-flow analysis in an attempt to overcome the overhead of the conservative compilation techniques used to ensure correct semantics of execution in the presence of complicated expressions. Beyond correctness, the manner in which data is arranged in memory has a major impact on performance [BAM+96] due to architectural tendencies.

We can surmise that one potential way of realizing greater application performance would then be to re-write our programs in FORTRAN to allow better-optimized compilation. Indeed, this is often the case with libraries and numerical kernels provided for specific computational purposes.
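As a rough illustration of why pointer-based C resists the loop analysis described above, the sketch below writes the same update twice in C. The function names `scale_add_ptr` and `scale_add_idx` are hypothetical and do not come from this paper. The OpenMP pragma stands in, by analogy, for the kind of directive Polaris emits in FORTRAN; it is not the translation scheme described here.

```c
#include <stddef.h>

/* Pointer-walking form. Before parallelizing, a compiler would have to
   prove that y and x never overlap and recover the trip count from the
   pointer comparison. */
void scale_add_ptr(double *y, const double *x, double a, size_t n)
{
    const double *end = x + n;
    while (x != end) {
        *y = *y + a * *x;
        y++;
        x++;
    }
}

/* Array-indexed, FORTRAN-like form. The loop bounds and subscripts are
   explicit, and restrict is the programmer's promise (an assumption of
   this sketch) that y and x do not overlap, so each iteration is
   independent. */
void scale_add_idx(double *restrict y, const double *restrict x,
                   double a, size_t n)
{
    long i;
    #pragma omp parallel for
    for (i = 0; i < (long)n; i++) {
        y[i] = y[i] + a * x[i];
    }
}
```

Compiled without OpenMP support, the pragma is ignored and both functions run serially with the same result. Writing every kernel by hand in the second, FORTRAN-like style is the kind of rewrite the preceding paragraph contemplates.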
However, this would be inconvenient for most programmers who are used to the conveniences of an expressive language. Furthermore, it would be extraordinarily difficult to re-write programs where