Abstract. Project ExaStencils pursues a radically new approach to stencil-code engineering. Present-day stencil codes are implemented in general-purpose programming languages, such as Fortran, C, or Java, or derivatives thereof, and harnesses for parallelism, such as OpenMP, OpenCL or MPI. ExaStencils favors a much more domain-specific approach with languages at several layers of abstraction, the most abstract being the mathematical formulation, the most concrete the optimized target code. At every layer, the corresponding language expresses not only computational directives but also domain knowledge of the problem and platform to be leveraged for optimization. This approach will enable highly automated code generation at all layers and has been demonstrated successfully before in the U.S. projects FFTW and SPIRAL for certain linear transforms.

The Challenges of Exascale Computing

The performance of supercomputers is on the way from petascale to exascale. Software technology for high-performance computing has been struggling to keep up with the advances in computing power, from terascale in 1996 to petascale in 2009 on to exascale, now being only a factor of 30 away and predicted for the end of the present decade. So far, traditional host languages, such as Fortran and C, equipped with harnesses for parallelism, such as MPI and OpenMP, have borne most of the burden, and they are being developed further with some new abstractions, notably the partitioned global address space (PGAS) memory model [1] [10]. Yet, the sequential host languages remain general-purpose: Fortran or C or, if object orientation is desired, C++ or Java.

The step from petascale to exascale performance challenges present-day software technology much more than the advances from gigascale to terascale and terascale to petascale have. The reason is that the explicit treatment of the massive parallelism inside one node of a high-performance cluster can no longer be avoided.
That is, the cluster nodes must be manycores with high numbers of cores. The reorientation of the computer market from single cores to multicores and manycores has been observed with concern [29]. In the high-performance market, the situation is somewhat alleviated by the fact that the additional cycles that large numbers of cores provide are actually in high demand. But the question of how to exploit them with efficient and robust software remains.

While the potential for massive parallelism on and off the chip is the single most serious challenge to exascale software technology, other challenges take on a high priority and are frequently mentioned, such as power conservation, fault tolerance and heterogeneity of the execution platform [2]. Ideally, one would strive for performance portability, i.e., the ability to switch the software with ease from one platform, when it is being decommissioned, to the next, while maintaining the highest performance.

ExaStencils Application Domain: Stencil Codes

Stencil codes have extremely high significance and value for a good-sized c...
Performance optimization of stencil codes requires data locality improvements. The polyhedron model for loop transformation is well suited for such optimizations with established techniques, such as the PLuTo algorithm and diamond tiling. However, for stencil codes, the domain of our project ExaStencils, these techniques fail to yield optimal results. As an alternative, we propose a new, optimized, multi-dimensional polyhedral search space exploration and demonstrate its effectiveness: we obtain better results than existing approaches in several cases. We also propose how to specialize the search for the domain of stencil codes, which dramatically reduces the exploration effort without significantly impairing performance.
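
To make the notions of a stencil code and a locality-oriented loop transformation concrete, the following is a minimal sketch in Python. It is not the PLuTo algorithm, diamond tiling, or the search space exploration described above; it only shows plain rectangular tiling of the two spatial loops of a 5-point Jacobi sweep. The function names `jacobi_step` and `jacobi_step_tiled` and the parameter `tile` are hypothetical and chosen for illustration.

```python
def jacobi_step(u):
    """One sweep of a 5-point Jacobi stencil on a 2D grid (list of lists).

    Boundary values are copied unchanged; each interior point becomes the
    average of its four neighbours in the old grid.
    """
    n, m = len(u), len(u[0])
    v = [row[:] for row in u]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            v[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return v


def jacobi_step_tiled(u, tile=4):
    """The same sweep with both spatial loops split into tile x tile blocks.

    Within one sweep every point reads only the old grid, so the points may
    be visited in any order; tiling changes the traversal order (and hence
    the memory access pattern), not the result.
    """
    n, m = len(u), len(u[0])
    v = [row[:] for row in u]
    for ii in range(1, n - 1, tile):          # loop over tile origins
        for jj in range(1, m - 1, tile):
            for i in range(ii, min(ii + tile, n - 1)):      # loop inside a tile
                for j in range(jj, min(jj + tile, m - 1)):
                    v[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1])
    return v


# Hypothetical test grid; any rectangular grid of floats works.
grid = [[float((3 * i + 7 * j) % 11) for j in range(10)] for i in range(9)]
same = jacobi_step_tiled(grid, tile=4) == jacobi_step(grid)
```

Tiling across several time steps, as diamond tiling does, additionally has to respect the dependences between successive sweeps; this sketch deliberately stays within a single sweep, where no such dependences exist.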
Performance optimizations should focus not only on the computations of an application, but also on the internal data layout. A well-known problem is whether a struct of arrays or an array of structs results in higher performance for a particular application. Even though the switch from one to the other is fairly simple to implement, testing both layouts can become laborious and error-prone. Additionally, there are more complex data layout transformations, such as color splitting for multi-color kernels in the domain of stencil codes, that are difficult to implement manually. As a remedy, we propose new flexible layout transformation statements for our domain-specific language ExaSlang that support arbitrary affine transformations. Since our code generator applies them automatically to the generated code, these statements enable the simple adaptation of the data layout without the need for any other modifications of the application code. This constitutes a big advance in the ease of testing and evaluating different memory layout schemes in order to identify the best one.
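
The layouts mentioned above can be viewed as different mappings from a logical index to a position in a flat buffer. The following Python sketch illustrates that view under stated assumptions; it is not ExaSlang syntax and does not reproduce the proposed transformation statements. All names (`aos_index`, `soa_index`, `color_split_index`, `pack`) are hypothetical. Note that the color-splitting map uses integer division and modulo, so it is an affine map only in the extended (quasi-affine) sense.

```python
def aos_index(i, c, n, ncomp):
    """Array of structs: the ncomp components of element i are adjacent."""
    return i * ncomp + c


def soa_index(i, c, n, ncomp):
    """Struct of arrays: all n values of component c are adjacent."""
    return c * n + i


def color_split_index(i, n):
    """Two-color (red-black) splitting in 1D.

    Even-indexed points are stored first, odd-indexed points second, so a
    kernel that updates one color walks through contiguous memory.
    """
    half = (n + 1) // 2          # number of even-indexed points
    return (i % 2) * half + i // 2


def pack(n, ncomp, value, index):
    """Fill a flat buffer so that value(i, c) sits at index(i, c, n, ncomp)."""
    buf = [None] * (n * ncomp)
    for i in range(n):
        for c in range(ncomp):
            buf[index(i, c, n, ncomp)] = value(i, c)
    return buf


# Three elements with components x and y (illustrative labels only).
label = lambda i, c: "xy"[c] + str(i)
aos = pack(3, 2, label, aos_index)   # ['x0', 'y0', 'x1', 'y1', 'x2', 'y2']
soa = pack(3, 2, label, soa_index)   # ['x0', 'x1', 'x2', 'y0', 'y1', 'y2']

# Five grid points reordered by color: even points first, then odd points.
split = [None] * 5
for i in range(5):
    split[color_split_index(i, 5)] = i   # [0, 2, 4, 1, 3]
```

Because only the index function changes between layouts, the code that produces the values stays the same. This is the property that makes it cheap to try several layouts, which is the motivation stated in the abstract.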