2012
DOI: 10.1007/978-3-642-29737-3_40

Spherical Harmonic Transform with GPUs

Abstract: We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphic processing units (GPU). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus our attention on the two major sequential steps involved in the transforms computation, retaining the efficient parallel framework of the original code. We detail optimization techniques used to enhance the performance of the CUDA-based code and cont…
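
For orientation, here is a minimal sketch of what an inverse spherical harmonic transform computes on one iso-latitude ring. The truncated abstract does not name its "two major sequential steps"; this sketch assumes the textbook split into a Legendre-recurrence sum followed by a Fourier sum over the azimuthal index m. It is plain Python, not the S2HAT or CUDA code, and the names `normalized_legendre` and `synthesize_ring` are hypothetical.

```python
import cmath
import math


def normalized_legendre(lmax, m, theta):
    """Return [lambda_lm(theta) for l = m..lmax], with Y_lm = lambda_lm * exp(i*m*phi)."""
    x, s = math.cos(theta), math.sin(theta)
    # Seed value lambda_mm, built up from lambda_00 = sqrt(1 / (4*pi)).
    lam_mm = math.sqrt(1.0 / (4.0 * math.pi))
    for k in range(1, m + 1):
        lam_mm *= -math.sqrt((2.0 * k + 1.0) / (2.0 * k)) * s
    values = [lam_mm]
    if lmax > m:
        values.append(math.sqrt(2.0 * m + 3.0) * x * lam_mm)
    # Standard three-term recurrence in l at fixed m.
    for l in range(m + 2, lmax + 1):
        a = math.sqrt((4.0 * l * l - 1.0) / (l * l - m * m))
        b = math.sqrt(((l - 1.0) ** 2 - m * m) / (4.0 * (l - 1.0) ** 2 - 1.0))
        values.append(a * (x * values[-1] - b * values[-2]))
    return values


def synthesize_ring(alm, lmax, theta, phis):
    """Evaluate a real field on one ring from coefficients alm[(l, m)], m >= 0."""
    # Step 1 (assumed): Legendre sums, Delta_m = sum_l a_lm * lambda_lm(theta).
    delta = []
    for m in range(lmax + 1):
        lam = normalized_legendre(lmax, m, theta)
        total = sum(alm.get((l, m), 0.0) * lam[l - m] for l in range(m, lmax + 1))
        delta.append(complex(total))
    # Step 2 (assumed): Fourier sum over m, written as a direct sum for clarity.
    # Negative m is folded in through the real-field symmetry, hence the factor 2.
    ring = []
    for phi in phis:
        value = delta[0].real
        for m in range(1, lmax + 1):
            value += 2.0 * (delta[m] * cmath.exp(1j * m * phi)).real
        ring.append(value)
    return ring
```

A production code would replace the direct Fourier sum with an FFT per ring and distribute rings or m-values across processes. The sketch only fixes the order of operations.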

Cited by 16 publications (24 citation statements)
References 12 publications
“…The major bottleneck of the code performance is due to the need of calculating a single inverse spherical harmonic transform which is required to obtain the overpixelized map of the unlensed signal. This can certainly be alleviated further by using better algorithms and/or numerical implementations, e.g., capitalizing on hardware accelerators such as GPGPU (Hupca et al 2012; Szydlarski et al 2011; Fabbian et al 2012; Reinecke & Seljebotn 2013). We leave these code optimizations for future work.…”
Section: Discussion (mentioning)
confidence: 99%
“…The core of the library is written in F90 with a C interface and it uses the message passing interface (MPI) to institute distributed memory communication, which ensures its portability. The latest release of the library also includes routines suitable for general purpose graphic processing units (GPGPUs) coded in CUDA (Hupca et al 2012; Szydlarski et al 2011; Fabbian et al 2012).…”
Section: Spherical Harmonic Transforms (mentioning)
confidence: 99%
“…Averaging over these circles enables us to compile datasets of higher signal-to-noise ratio. It has been shown by Janssen et al (1996) that the low frequency noise can be represented by a uniform offset on a given baseline, corresponding in the case of Planck to a stable pointing period called a ring. However, it can also be adapted to ground-based or balloon-borne experiments (Patanchon et al 2008; Sutton et al 2010).…”
Section: Destriping Methods (mentioning)
confidence: 99%
“…In the case of azimuthally symmetric patches, the numerical computation of such Fisher matrices (either F or F rr), can be performed in a reasonable time using the expression found in the appendix F of Ref. [22], and by using the S2HAT package to perform spherical harmonic transforms [41][42][43][44]. (The use of this massively parallel package allows for a rapid computation of the covariance matrix for large sky coverages.)…”
Section: Minimum Variance Quadratic Estimator (mentioning)
confidence: 99%