In this work we consider the numerical computation of eigenpairs of integral operators associated with a radiative transfer problem. The target problem models the emission of photons in stellar atmospheres and is formulated in terms of a weakly singular Fredholm integral equation defined on a Banach space. We examine parallel direct and iterative strategies for the eigensolution phase.
Description of the problem

In this contribution we focus on the numerical computation of eigenpairs of integral operators associated with a model for the emission of photons in stellar atmospheres; see [1, 2] for details of the problem. In a previous work we investigated numerical techniques for the solution of the associated radiative transfer equation [3], considering parallel distributed processing. Here we apply direct and iterative strategies available in open-source software packages to investigate the spectral properties of the problem. Specifically, we use packages available in the ACTS Collection of the US Department of Energy (DOE) [4]. We seek to solve T ϕ = λ ϕ, where T is the integral operator defined by

(T ϕ)(τ) = (ϖ/2) ∫₀^{τ*} E₁(|τ − τ′|) ϕ(τ′) dτ′,  τ ∈ [0, τ*],

where τ is the optical depth of the stellar atmosphere, τ* is the optical thickness of the atmosphere, ϖ ∈ [0, 1] is the albedo, and E₁ is the first exponential-integral function, E₁(τ) = ∫₁^∞ (e^{−τμ}/μ) dμ, which is weakly singular at the origin.
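A discretization of this operator can be sketched with a simple Nyström scheme. The snippet below is our own illustration, not the authors' code: it assumes the kernel (ϖ/2) E₁(|τ − τ′|), uses a midpoint rule off the diagonal, and integrates the weakly singular diagonal cell exactly via the identity ∫₀ˣ E₁(t) dt = 1 − E₂(x); the function name and parameter values are illustrative.

```python
# Sketch (not the authors' code): Nystrom discretization of the radiative
# transfer operator on [0, tau_star], assuming the weakly singular kernel
# (varpi/2) * E1(|tau - tau'|), E1 being the first exponential integral.
import numpy as np
from scipy.special import exp1, expn  # exp1 = E1, expn(2, x) = E2

def transfer_matrix(m, tau_star=10.0, varpi=0.75):
    """Midpoint-rule discretization; the singular diagonal cell is
    integrated exactly using int_0^x E1(t) dt = 1 - E2(x)."""
    h = tau_star / m
    tau = (np.arange(m) + 0.5) * h                 # cell midpoints
    diff = np.abs(tau[:, None] - tau[None, :])
    # off-diagonal entries: kernel value at midpoints times cell width
    # (zeros on the diagonal are masked before calling exp1)
    A = 0.5 * varpi * h * exp1(np.where(diff == 0, 1.0, diff))
    # diagonal entries: exact integral of E1 over the singular cell,
    # 2 * int_0^{h/2} E1(t) dt = 2 * (1 - E2(h/2))
    np.fill_diagonal(A, 0.5 * varpi * 2.0 * (1.0 - expn(2, h / 2)))
    return A

A = transfer_matrix(200)
# for this kernel the spectral radius stays below the albedo varpi
lam = np.max(np.abs(np.linalg.eigvals(A)))
```

Since the kernel depends only on |τ − τ′| and the quadrature weights are uniform, the resulting matrix is symmetric with entries that decay quickly away from the diagonal, which is consistent with the banded structure exploited later.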
Solution strategy

Our goal is to approximate Tₘ ϕₘ = λₘ ϕₘ by solving an associated matrix eigenvalue problem Ax = λx for large values of m, where A is nonsymmetric, banded, and sparse. To this end, we parallelized the generation of A (using MPI) and experimented with two numerical strategies for the eigensolution phase: direct methods implemented in the ScaLAPACK library to compute the Schur decomposition of A, and iterative methods implemented in the SLEPc library to compute a few eigenpairs of A. With ScaLAPACK we first reduce A to Hessenberg form (subroutine pdgehrd) and then compute the Schur decomposition of the Hessenberg matrix (subroutine pdlahqr). With SLEPc, we used a recently implemented version of the Krylov-Schur algorithm, which has proven to be faster and more accurate than the Arnoldi algorithm in a number of cases.

Our tests were performed on three machines: Jacquard and Bassi, located at the National Energy Research Scientific Computing Center (NERSC) of the US DOE, and GridUP, located at the University of Porto. Jacquard is an AMD Opteron cluster with 356 dual-processor nodes, 2.2 GHz processors, and 6 GB of memory per node, interconnected with a high-speed InfiniBand network. Bassi is an IBM p575 POWER 5 system with 122 8-processor nodes and 32 GB of memory per node. GridUP is an AMD Opteron cluster with 24 dual-processor nodes, 2.4 GHz processors, and 4 GB of memory per node, interconnected with a Gigabit Ethernet network. On Jacquard and GridUP we used the basic linear algebra subroutines (BLAS) available in the ACML library, and on Bassi the BLAS available in the ESSL library. In all cases shown here we used an albedo ϖ = 0.75 (tests with larger values of ϖ required similar computing times). We used a relative error ε ≤ 10⁻¹² for the iterative method in order to obtain ...
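The two strategies can be illustrated with a small serial analogue. In the sketch below (ours, not the paper's parallel code), scipy.linalg.schur stands in for the ScaLAPACK pdgehrd/pdlahqr route, and ARPACK's implicitly restarted Arnoldi method, via scipy.sparse.linalg.eigs, plays the role that SLEPc's Krylov-Schur solver plays in the paper; the random test matrix is only a stand-in for the discretized operator.

```python
# Serial sketch of the two eigensolution strategies (illustrative only):
# scipy.linalg.schur stands in for ScaLAPACK's pdgehrd + pdlahqr, and
# ARPACK's restarted Arnoldi (scipy.sparse.linalg.eigs) stands in for
# SLEPc's Krylov-Schur. The matrix A here is a random stand-in.
import numpy as np
from scipy.linalg import schur
from scipy.sparse.linalg import eigs

rng = np.random.default_rng(0)
n = 300
A = rng.standard_normal((n, n)) / np.sqrt(n)   # nonsymmetric test matrix

# Direct strategy: real Schur decomposition A = Z S Z^T with S
# quasi-triangular; all eigenvalues come from S's 1x1/2x2 diagonal blocks.
S, Z = schur(A, output='real')
ev_direct = np.linalg.eigvals(S)

# Iterative strategy: only a few eigenpairs of largest magnitude, with a
# tight tolerance comparable to the paper's (<= 1e-12).
ev_iter, V = eigs(A, k=4, which='LM', tol=1e-12)

# The dominant eigenvalues from the direct route should match the
# iterative ones in magnitude.
top = ev_direct[np.argsort(-np.abs(ev_direct))][:4]
```

The trade-off mirrors the one in the paper: the direct route delivers the full spectrum at O(n³) cost, while the iterative route targets only the few dominant eigenpairs, which is all the application requires.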