Abstract. The spectral method with discrete spherical harmonics transform plays an important role in many applications. In spite of its advantages, the spherical harmonics transform has a drawback of high computational complexity, which is determined by that of the associated Legendre transform, and the direct computation requires time of O(N 3 ) for cut-off frequency N . In this paper, we propose a fast approximate algorithm for the associated Legendre transform. Our algorithm evaluates the transform by means of polynomial interpolation accelerated by the Fast Multipole Method (FMM). The divideand-conquer approach with split Legendre functions gives computational complexity O(N 2 log N ). Experimental results show that our algorithm is stable and is faster than the direct computation for N ≥ 511.
Abstract-Power dissipation is one of the most imminent limitation factors influencing the development of High Performance Computing (HPC). Toward power-efficient HPC on CPU-GPU hybrid platform, we are investigating software methodologies to achieve optimized power utilization by algorithm design and programming technique. In this paper we discuss power measurements of GPU, propose a method of automatic extraction of power data of CUDA kernels from long measurement sequence, and execute an exactitude and effective power analysis on CUDA kernels. By using the proposed method above, we measured a sample kernel that performs single precision floating point additions on GeForce 8800 GTS. Our results suggest that the power consumption by a nonworking thread in underoccupied half-warp is 71% of the power consumed by a working thread.
Abstract. The present paper discusses scalable implementations of sparse matrix-vector products, which are crucial for high performance solutions of large-scale linear equations, on a cc-NUMA machine SGI Altix3700. Three storage formats for sparse matrices are evaluated, and scalability is attained by implementations considering the page allocation mechanism of the NUMA machine. Influences of the cache/memory bus architectures on the optimum choice of the storage format are examined, and scalable converters between storage formats shown to facilitate exploitation of storage formats of higher performance.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.