We present a port of the numerical relativity code SpEC which is capable of running on NVIDIA GPUs. Since this code must be maintained in parallel with SpEC itself, a primary design consideration is to make as few explicit code changes as possible. We therefore rely on a hierarchy of automated porting strategies. At the highest level we use TLoops, a C++ library of our design, to automatically emit CUDA code equivalent to tensorial expressions written into C++ source using a syntax similar to that of analytic calculation. Next, we trace out and cache explicit matrix representations of the numerous linear transformations in the SpEC code, which allows these to be performed on the GPU using pre-existing matrix-multiplication libraries. We port the few remaining important modules by hand. In this paper we detail the specifics of our port, and present benchmarks of it simulating isolated black hole spacetimes on several generations of NVIDIA GPUs.

arXiv:1804.09101v1 [gr-qc] 24 Apr 2018. GPU-Accelerated Simulations of Isolated Black Holes
Introduction

Numerical relativity (NR), the direct numerical integration of the Einstein field equations, is now a mature subfield of computational physics, with stable binary black hole evolutions possible since 2005 [1-5]. Binary black hole NR is of central importance for gravitational waveform modeling. For instance, the SpEC waveform catalogues [6,7] were used in the development of effective-one-body (EOB) waveform models [8-10]. Waveform models calibrated to numerical relativity were used to analyse the recent binary black hole (BBH) gravitational wave (GW) events detected by LIGO and Virgo [11,12]. Furthermore, NR binary black hole simulations were used to assess systematic errors in the parameter estimation of these GW events [13-15].

Detailed knowledge of the expected waveforms, which themselves come ultimately from NR simulations, is required by these detectors to maximize sensitivity, to interpret observations, and to test general relativity [16]. Conversely, the relative insensitivity of ground-based detectors to, e.g., eccentric binaries is due in part to a lack of production-quality simulations in the eccentric region of parameter space [17], a situation which also impairs comparisons with analytic theory.

The intricacy of the Einstein equations presents two challenges to NR. First, interesting simulations are expensive, with wallclock times measured in weeks or months. Second, codes able to perform such simulations are quite intricate from a software engineering perspective. Applying them to new regions of the binary black hole parameter space, let alone to new spacetimes, can require months of effort by small groups of experts. These issues are difficult to address simultaneously, since improving runtime tends to complicate code, and vice versa.

Twenty years ago, the simplest solution would have been to wait for hardware to improve.
Unfortunately, CPU clock frequencies have been essentially static for some time now, with new high-performance computers instead employing increasingly massive levels of parallelism. But...