The importance of properly optimizing code for execution on superscalar processors was investigated. No domain specialist was available during the optimization investigation. For this study of an existing serial FORTRAN application, the use of compiler switches, manual coding techniques, a commercial preprocessor utility (KAP), and a commercial parallelization utility (FORGE) showed the potential to affect execution performance by more than an order of magnitude. The application for the case study was a three-dimensional boundary element code that modeled spherical particle transport phenomena in a particle suspension. Separate experiments were conducted on two processor platforms: a four-node IBM SP (160 MHz POWER2 CPU) and a single-node DEC Alpha (667 MHz 21164 CPU). Execution times for the non-optimized, serial base case were 72 hours on a single IBM SP node and 66 hours on the DEC Alpha. Using a combination of compiler switches and manual optimizations, such as in-lining of inefficient subroutines, execution times were reduced to 7.5 hours on a single IBM SP node and 5.4 hours on the DEC Alpha. The use of the KAP preprocessor reduced execution time to 2.3 hours on a single IBM SP node. Using the parallelization software FORGE and four nodes on the IBM SP resulted in an execution time of 25.8 hours without compiler optimization and 3.0 hours using compiler switches for optimization.
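The reported times can be turned into speedup factors to make the "order of magnitude" claim concrete. The short Python sketch below is illustrative only and is not part of the study: the execution times are those quoted above, while the names `BASE_HOURS`, `RUNS`, `speedup`, and `speedup_table` are hypothetical. It assumes each run is compared against the non-optimized serial base case on the same platform, so the four-node FORGE runs are measured against the 72-hour single-node IBM SP time.

```python
# Execution times in hours, as quoted in the abstract.
# Baselines are the non-optimized serial runs on each platform.
BASE_HOURS = {"IBM SP": 72.0, "DEC Alpha": 66.0}

# (platform, configuration, hours); the labels are illustrative shorthand.
RUNS = [
    ("IBM SP", "compiler switches + manual optimization, 1 node", 7.5),
    ("DEC Alpha", "compiler switches + manual optimization", 5.4),
    ("IBM SP", "KAP preprocessor, 1 node", 2.3),
    ("IBM SP", "FORGE, 4 nodes, no compiler optimization", 25.8),
    ("IBM SP", "FORGE, 4 nodes, compiler switches", 3.0),
]


def speedup(platform, hours):
    """Speedup relative to the serial base case on the same platform."""
    return BASE_HOURS[platform] / hours


def speedup_table():
    """Return (platform, configuration, speedup rounded to one decimal)."""
    return [(p, label, round(speedup(p, h), 1)) for p, label, h in RUNS]


if __name__ == "__main__":
    for platform, label, factor in speedup_table():
        print(f"{platform:9s} {label:50s} {factor:5.1f}x")
```

Under these assumptions, the single-node KAP run corresponds to a speedup of roughly 31 over the IBM SP base case, while the FORGE run without compiler optimization gains less than a factor of 3 despite using four nodes.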