Abstract. Many applications require the numerical solution of a partial differential equation (PDE), leading to large, sparse linear systems. Multigrid methods can often solve these systems efficiently. To adapt a multigrid method to a given problem, local Fourier analysis (LFA) can be used; it provides quantitative predictions about the behavior of the components of a multigrid method. In this paper we generalize LFA to handle what we call periodic stencils. An operator given by a periodic stencil has a block Fourier symbol representation, which yields the spectral radius and norm of the operator. Furthermore, block Fourier symbols can be used to determine how an operator acts on smooth or oscillatory input and whether its output will be smooth or oscillatory. This information can then be used to construct efficient smoothers and coarse-grid corrections. We consider a particular PDE with jumping coefficients and show that it leads to a periodic stencil. LFA shows that the Jacobi method is a suitable smoother for this problem and that an operator-dependent interpolation is better than linear interpolation, as suggested by numerical experiments described in the literature. If an operator is given by an ordinary stencil, then block smoothers yield periodic stencils whenever the blocks correspond to rectangles in the domain. LFA shows that the block Jacobi and red-black block Jacobi methods efficiently reduce more frequencies than their pointwise versions. Furthermore, it shows that a block smoother used in combination with aggressive coarsening can, to some degree, compensate for the reduced convergence rate caused by aggressive coarsening.
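The kind of quantitative prediction LFA delivers can be illustrated on an ordinary (non-periodic) stencil. For the 2D five-point Laplacian, the Fourier symbol of the weighted Jacobi error propagator is S(θ) = 1 − ω(1 − (cos θ₁ + cos θ₂)/2), and the smoothing factor is the maximum of |S| over the high frequencies. The sketch below is an illustrative assumption, not the periodic-stencil machinery of the paper; it recovers the classical value μ = 3/5 for ω = 4/5.

```python
import numpy as np

# LFA smoothing factor for weighted Jacobi on the 2D five-point Laplacian.
# Symbol of the error propagator: S(theta) = 1 - w*(1 - (cos t1 + cos t2)/2).
def jacobi_smoothing_factor(w, n=129):
    t = np.linspace(-np.pi, np.pi, n)
    t1, t2 = np.meshgrid(t, t)
    S = 1.0 - w * (1.0 - (np.cos(t1) + np.cos(t2)) / 2.0)
    # High frequencies: those not representable on the coarse grid,
    # i.e. max(|theta_1|, |theta_2|) >= pi/2.
    high = np.maximum(np.abs(t1), np.abs(t2)) >= np.pi / 2
    return np.max(np.abs(S[high]))

mu = jacobi_smoothing_factor(0.8)  # classical result: mu = 3/5
```

The maximum is attained at frequencies such as (π/2, 0), where S = 1 − 0.8·(1 − 1/2) = 0.6, which is why ω = 4/5 is the standard damping choice for this problem.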
Abstract. Project ExaStencils pursues a radically new approach to stencil-code engineering. Present-day stencil codes are implemented in general-purpose programming languages, such as Fortran, C, or Java, or derivatives thereof, combined with harnesses for parallelism, such as OpenMP, OpenCL, or MPI. ExaStencils favors a much more domain-specific approach with languages at several layers of abstraction, the most abstract being the mathematical formulation and the most concrete the optimized target code. At every layer, the corresponding language expresses not only computational directives but also domain knowledge of the problem and platform to be leveraged for optimization. This approach will enable highly automated code generation at all layers and has been demonstrated successfully before in the U.S. projects FFTW and SPIRAL for certain linear transforms.

The Challenges of Exascale Computing

The performance of supercomputers is on the way from petascale to exascale. Software technology for high-performance computing has been struggling to keep up with the advances in computing power, from terascale in 1996 to petascale in 2009 and on to exascale, now only a factor of 30 away and predicted for the end of the present decade. So far, traditional host languages, such as Fortran and C, equipped with harnesses for parallelism, such as MPI and OpenMP, have carried most of the burden, and they are being developed further with some new abstractions, notably the partitioned global address space (PGAS) memory model [1] [10]. Yet the sequential host languages remain general-purpose: Fortran or C or, if object orientation is desired, C++ or Java.

The step from petascale to exascale performance challenges present-day software technology much more than the advances from gigascale to terascale and from terascale to petascale did. The reason is that the explicit treatment of the massive parallelism inside one node of a high-performance cluster can no longer be avoided.
That is, the cluster nodes must be manycores with high numbers of cores. The reorientation of the computer market from single cores to multicores and manycores has been observed with concern [29]. In the high-performance market, the situation is somewhat alleviated by the fact that the additional cycles that large numbers of cores provide are actually being yearned for. But the question of how to exploit them with efficient and robust software remains.

While the potential for massive parallelism on and off the chip is the single most serious challenge to exascale software technology, other challenges take on a high priority and are frequently mentioned, such as power conservation, fault tolerance, and heterogeneity of the execution platform [2]. At best, one would strive for performance portability, i.e., the ability to switch the software with ease from one platform, when it is being decommissioned, to the next, while maintaining the highest performance.

ExaStencils Application Domain: Stencil Codes

Stencil codes have extremely high significance and value for a good-sized c...
Deflation techniques for Krylov subspace methods have seen a lot of attention in recent years. They provide a means to improve the convergence speed of these methods by enriching the Krylov subspace with a deflation subspace. The most common approach for the construction of deflation subspaces is to use (approximate) eigenvectors, but more general subspaces are also applicable. In this paper we discuss two results concerning the accuracy requirements within the deflated CG method. First, we show that the effective condition number, which bounds the convergence rate of the deflated conjugate gradient method, depends asymptotically linearly on the size of the perturbations in the deflation subspace. Second, we discuss the accuracy required in calculating the deflating projection. This is crucial for the overall convergence of the method, and it also makes it possible to save some computational work. To show these results, we use the fact that, as a projection approach, deflation has many similarities to multigrid methods. In particular, recent results relate the spectrum of the deflated matrix to the spectrum of the error propagator of two-grid methods. In the spirit of these results, we show that the effective condition number can be bounded by the constant of a weak approximation property.
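The effect of an exact deflating projection on the spectrum can be seen in a toy example. With P = I − AW(WᵀAW)⁻¹Wᵀ and W spanned by the eigenvectors of the smallest eigenvalues, the deflated matrix PA has those eigenvalues replaced by zero, and CG then converges at a rate governed by the effective condition number of the remaining spectrum. A minimal sketch, with all matrices and values chosen purely for illustration:

```python
import numpy as np

# SPD test matrix with two tiny eigenvalues that spoil the condition number.
lam = np.array([1e-4, 1e-3, 1.0, 2.0, 5.0, 10.0])
A = np.diag(lam)

# Deflation subspace W: exact eigenvectors of the two smallest eigenvalues.
W = np.eye(6)[:, :2]

# Deflating projection P = I - A W (W^T A W)^{-1} W^T.
E = W.T @ A @ W
P = np.eye(6) - A @ W @ np.linalg.solve(E, W.T)

# Spectrum of the deflated matrix PA: the deflated eigenvalues become zero,
# the remaining ones are untouched (PA is symmetric here).
ev = np.sort(np.linalg.eigvalsh(P @ A))

kappa = lam[-1] / lam[0]      # original condition number: 1e5
kappa_eff = ev[-1] / ev[2]    # effective condition number: 10
```

With perturbed deflation vectors the zero eigenvalues become small nonzero ones, which is the situation whose asymptotically linear effect on the effective condition number the paper analyzes.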
Abstract. The Lanczos process constructs a sequence of orthonormal vectors v_m spanning a nested sequence of Krylov subspaces generated by a Hermitian matrix A and some starting vector b. In this paper we show how to cheaply recover a secondary Lanczos process starting at an arbitrary Lanczos vector v_m. This secondary process is then used to efficiently obtain computable error estimates and error bounds for the Lanczos approximations to the action of a rational matrix function on a vector. This includes, as a special case, the Lanczos approximation to the solution of a linear system Ax = b. Our approach uses the relation between the Lanczos process and quadrature as developed by Golub and Meurant. It differs from previously known methods through its use of the secondary Lanczos process. With our approach, it is now possible, in particular, to efficiently obtain upper bounds for the error in the 2-norm, provided a lower bound on the smallest eigenvalue of A is known. This holds in particular for a large class of rational matrix functions, including best rational approximations to the inverse square root and the sign function. We compare our approach to other error estimates and bounds known from the literature and include results of several numerical experiments.
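As a reminder of the setting, the primary Lanczos process reduces A to a tridiagonal matrix T_m with respect to the orthonormal basis V_m, and x_m = ‖b‖ V_m T_m⁻¹ e_1 is the Lanczos approximation to A⁻¹b, i.e., to f(A)b for f(z) = 1/z. The following is a minimal sketch of this primary process only, not of the secondary process or the error bounds of the paper; the test problem and sizes are chosen for illustration.

```python
import numpy as np

def lanczos(A, b, m):
    """Plain Lanczos: returns V (n x m, orthonormal) and tridiagonal T (m x m)."""
    n = len(b)
    V = np.zeros((n, m + 1))
    alpha = np.zeros(m)
    beta = np.zeros(m + 1)
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        w = A @ V[:, j]
        if j > 0:
            w -= beta[j] * V[:, j - 1]
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]
        beta[j + 1] = np.linalg.norm(w)
        if beta[j + 1] > 0:
            V[:, j + 1] = w / beta[j + 1]
    T = np.diag(alpha) + np.diag(beta[1:m], 1) + np.diag(beta[1:m], -1)
    return V[:, :m], T

# Well-conditioned Hermitian (here: real symmetric) test problem.
rng = np.random.default_rng(0)
A = np.diag(np.linspace(1.0, 10.0, 100))
b = rng.standard_normal(100)

V, T = lanczos(A, b, m=40)
e1 = np.zeros(40)
e1[0] = 1.0
x_m = np.linalg.norm(b) * V @ np.linalg.solve(T, e1)  # Lanczos approx. of A^{-1} b

x_true = b / np.diag(A)  # A is diagonal, so the exact solution is known
rel_err = np.linalg.norm(x_m - x_true) / np.linalg.norm(x_true)
```

The error ‖x_m − A⁻¹b‖ is exactly what the paper's secondary-process machinery estimates and bounds without access to the true solution.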