Abstract. GPUs excel at highly parallel problems and can therefore dramatically increase computational performance. In electrodynamics and many other fields, the finite-difference time-domain (FDTD) method is widely used because of its simplicity, accuracy, and practicality. In this paper, we implement the FDTD method on Fermi architecture GPUs, NVIDIA's latest product, to better understand Fermi's new features, such as double-precision support and the improved memory hierarchy. We then compare two strategies: using shared memory, the traditional optimization method on GPUs, and relying on the L1 cache. We also provide insights into the performance difference between the two strategies. We demonstrate that parallel computations using only the L1 cache can achieve performance similar to, or even better than, the traditional shared-memory optimization when the dataset is not too large or when the related data are reused infrequently.
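To make the two strategies concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes a simplified 1D magnetic-field update, and the kernel names, the coefficient `coef`, the block size `BLOCK`, and the helper `max_abs_diff` are all hypothetical. The first kernel reads neighbouring cells directly from global memory and leaves reuse to the L1 cache; the second stages a tile in shared memory first. `cudaFuncSetCacheConfig` only requests a preferred split of on-chip memory, and how global loads are cached depends on the GPU generation.

```cuda
#include <cassert>
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCK = 256;  // threads per block (illustrative choice)

// Strategy A: read neighbours straight from global memory and rely on the
// hardware cache to capture the reuse between adjacent threads.
__global__ void update_h_l1(float* hy, const float* ez, float coef, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n - 1) hy[i] += coef * (ez[i + 1] - ez[i]);
}

// Strategy B: stage a tile of ez, plus one halo cell, in shared memory.
// Must be launched with blockDim.x == BLOCK.
__global__ void update_h_shared(float* hy, const float* ez, float coef, int n) {
    __shared__ float tile[BLOCK + 1];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int t = threadIdx.x;
    if (i < n) tile[t] = ez[i];
    // The last thread of each block also loads the right-hand halo cell.
    if (t == blockDim.x - 1 && i + 1 < n) tile[t + 1] = ez[i + 1];
    __syncthreads();
    if (i < n - 1) hy[i] += coef * (tile[t + 1] - tile[t]);
}

// Runs both kernels on the same input and returns the largest absolute
// difference between their outputs. Error checking is omitted for brevity.
float max_abs_diff(int n) {
    assert(n >= 2);
    std::vector<float> ez(n), a(n), b(n);
    for (int i = 0; i < n; ++i) ez[i] = std::sin(0.01f * i);

    const size_t bytes = n * sizeof(float);
    float *d_ez, *d_a, *d_b;
    cudaMalloc(&d_ez, bytes);
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMemcpy(d_ez, ez.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_a, 0, bytes);
    cudaMemset(d_b, 0, bytes);

    // Request a larger L1 cache for strategy A, more shared memory for B.
    cudaFuncSetCacheConfig(update_h_l1, cudaFuncCachePreferL1);
    cudaFuncSetCacheConfig(update_h_shared, cudaFuncCachePreferShared);

    const int grid = (n + BLOCK - 1) / BLOCK;
    update_h_l1<<<grid, BLOCK>>>(d_a, d_ez, 0.5f, n);
    update_h_shared<<<grid, BLOCK>>>(d_b, d_ez, 0.5f, n);

    // Blocking copies on the default stream wait for the kernels to finish.
    cudaMemcpy(a.data(), d_a, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(b.data(), d_b, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_ez);
    cudaFree(d_a);
    cudaFree(d_b);

    float diff = 0.0f;
    for (int i = 0; i < n; ++i) diff = std::fmax(diff, std::fabs(a[i] - b[i]));
    return diff;
}

int main() {
    std::printf("max |L1 - shared| = %g\n", max_abs_diff(1 << 20));
    return 0;
}
```

In this sketch each field value is reused by only two threads, which is the low-reuse regime in which the abstract reports that the L1-only strategy can match shared memory. A stencil with heavier reuse would be expected to shift the balance toward explicit staging.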