Several analysis tasks feature the propagation of large sets of trajectories, ranging from monitoring the space debris environment to assessing the compliance of space missions with planetary protection policies. Increasingly stringent accuracy requirements inevitably make high-fidelity analyses more computationally intensive, easily requiring the propagation of hundreds of thousands of trajectories for a single task. For this reason, high-performance computing (HPC) and GPU (Graphics Processing Unit) computing become enabling technologies for such analyses. The latter, given its accessible cost and hardware availability, has been increasingly adopted in the past decade: more powerful graphics cards reach the market every year, and new GPU-dedicated algorithms are continuously developed; for reference, GPUs are the hardware on which the best-known artificial intelligence models are trained. Reference tools for the astrodynamics community are SNAPPshot [1] and CUDAjectory [2]: both aim at efficient propagations, the former being a CPU-based software suite for planetary protection analyses, the latter a high-fidelity, high-efficiency GPU ballistic propagator. Both tools follow a traditional step-based logic, which takes initial states and studies their step-by-step evolution in time.
The proposed work builds on previously obtained results [3], proposing an alternative algorithmic logic specifically designed for HPC and GPU computing, in order to extract the maximum performance from these computational architectures. In contrast to traditional, step-based numerical schemes, the Picard-Chebyshev (PC) method starts the integration process from samples of a trajectory guess, which are iteratively updated until the supplied dynamical model is matched. Besides the evaluation of the dynamics function, the core of this numerical scheme is a sequence of matrix multiplications: this feature makes the method, in principle, highly suitable for parallel and GPU computing. However, the limited number of trajectory nodes required to reach high accuracy levels (100-200) hinders the parallel efficiency of the algorithm: for systems this small, the parallel overhead outweighs the possible acceleration. In [3], an augmented version of the basic Picard-Chebyshev simulation scheme for the propagation of large sets of trajectories is proposed. Instead of integrating each trajectory individually, either sequentially or in parallel, an augmented dynamical system collecting all the samples is built and fed to the PC scheme. This approach outperforms the individual simulations in every parallelisation case, and its GPU implementation runs faster than a 40-core CPU cluster already on low-end graphics cards.
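To illustrate why the augmented formulation restores GPU efficiency, the following minimal Python/NumPy sketch shows how stacking all trajectories into one augmented state turns the Picard update into a single large, dense matrix product. The integration operator P, the dynamics function f, and the fixed-point update form are illustrative placeholders for the actual Chebyshev-based operators of the scheme, not its exact expressions.

```python
import numpy as np

def picard_chebyshev_augmented(x0, f, P, tol=1e-12, max_iter=50):
    """Hedged sketch of an augmented Picard-Chebyshev iteration.

    x0 : (n_traj, d) array of initial states, one row per trajectory
    f  : dynamics function, f(tau, X) -> (N, n_traj*d) sampled derivatives
    P  : (N, N) Picard integration operator (placeholder for the
         Chebyshev quadrature/update matrices of the real scheme)
    """
    N = P.shape[0]                                  # number of Chebyshev nodes
    tau = -np.cos(np.pi * np.arange(N) / (N - 1))   # Chebyshev-Gauss-Lobatto nodes
    x0_flat = x0.reshape(1, -1)                     # augmented initial state
    X = np.tile(x0_flat, (N, 1))                    # initial guess: constant trajectory
    for _ in range(max_iter):
        F = f(tau, X)                               # dynamics at all nodes, all trajectories
        X_new = x0_flat + P @ F                     # single dense matrix product
        err = np.max(np.abs(X_new - X))             # iteration error
        X = X_new
        if err < tol:
            break
    return X
```

Because the dense product spans all stacked trajectories at once, its arithmetic intensity grows with the size of the set, which is what makes the scheme attractive for GPU execution despite the modest number of nodes per trajectory.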
This work introduces and implements the latest version of the PC scheme, which features iteration error feedback and second-order dynamics adaptability for improved iteration efficiency [4]. These adaptations reduce the computational time by a factor of four, owing to the reduced number of iterations required to converge. In addition, the algorithm is adapted to the newest-generation NVIDIA graphics cards, also exploiting the novel Tensor Core architecture for double-precision computation, yielding an updated GPU software that is overall 50-100 times faster than its original version.
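As a deliberately simplified illustration of the second-order adaptation, the sketch below refreshes velocity and position samples from a single acceleration evaluation through two cascaded integration operators. The operator names Pa and Pv and the acceleration function g are hypothetical placeholders, not the exact operators of [4], and the iteration-error feedback term is omitted.

```python
import numpy as np

def pc_second_order(r0, v0, g, Pa, Pv, tol=1e-12, max_iter=50):
    """Hedged sketch of a second-order Picard-Chebyshev update.

    r0, v0 : (3,) initial position and velocity
    g      : acceleration function, g(tau, R) -> (N, 3)
    Pa, Pv : (N, N) placeholder operators mapping accelerations to velocity
             updates and velocities to position updates, respectively
    """
    N = Pa.shape[0]
    tau = -np.cos(np.pi * np.arange(N) / (N - 1))   # Chebyshev-Gauss-Lobatto nodes
    R = np.tile(r0, (N, 1))                         # position guess at all nodes
    V = np.tile(v0, (N, 1))                         # velocity guess at all nodes
    for _ in range(max_iter):
        A = g(tau, R)                               # one acceleration evaluation per node
        V_new = v0 + Pa @ A                         # update velocity samples
        R_new = r0 + Pv @ V_new                     # update position samples
        err = max(np.max(np.abs(R_new - R)), np.max(np.abs(V_new - V)))
        R, V = R_new, V_new
        if err < tol:
            break
    return R, V
```

Exploiting the second-order structure avoids doubling the integrated state, while the error-feedback term of [4], not shown here, is what drives the reduction in the number of iterations.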
Finally, an interface between the proposed scheme and regularised formulations (e.g., [5]) is proposed, aiming at improving the robustness of the software in tackling near-singular and divergent sets of trajectories. Performance and accuracy comparisons against the standard Cartesian propagation case, in terms of the number of trajectory samples required by the PC scheme, are presented. Regularised formulations require fewer trajectory samples than the Cartesian case to reach a given relative error threshold, which in turn results in a notable decrease in computational runtime. These improved software capabilities are tested in several critical scenarios, presenting a complete analysis of close encounters in the Circular Restricted Three-Body Problem, encompassing deep, shallow, and impacting flybys. Here, a further advantage of regularised formulations comes into play: the impact singularity of the gravitational model is removed by construction, making it feasible to treat impacting trajectories and shallow encounters in a single common augmented propagation.
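As a reminder of why the singularity removal holds (stated here with standard Kustaanheimo-Stiefel notation, following the spirit of [5] rather than its exact expressions), the Cartesian two-body acceleration is singular as r tends to zero, whereas the Sundman time transformation combined with the KS coordinate map turns the unperturbed problem into a linear oscillator in the new variables:

```latex
% Cartesian form: singular as r -> 0 (p collects perturbing accelerations)
\ddot{\mathbf{r}} = -\frac{\mu}{r^{3}}\,\mathbf{r} + \mathbf{p}
% Sundman transformation to fictitious time s, with the KS map r = L(u)u, r = |u|^2
\frac{dt}{ds} = r
% Unperturbed KS equation of motion (' = d/ds), with h the (positive) Keplerian
% energy of a bound orbit: a linear oscillator, regular through r = 0
\mathbf{u}'' + \frac{h}{2}\,\mathbf{u} = \mathbf{0}, \qquad h = \frac{\mu}{r} - \frac{v^{2}}{2}
```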
[1] Colombo C., Letizia F., Van Der Eynde J., “SNAPPshot: ESA planetary protection compliance verification software, Final report V1.0”, Technical Report ESA-IPL-POM-MB-LE-2015-315, University of Southampton, 2016
[2] Geda M., Noomen R., Renk F., “Massive Parallelization of Trajectory Propagations using GPUs”, Master’s thesis, Delft University of Technology, 2019, http://resolver.tudelft.nl/uuid:1db3f2d1-c2bb-4188-bd1e-dac67bfd9dab
[3] Masat A., Colombo C., Boutonnet A., “GPU-based high-precision orbital propagation of large sets of initial conditions through Picard-Chebyshev augmentation”, Acta Astronautica, 2023, https://doi.org/10.1016/j.actaastro.2022.12.037
[4] Woollands R., Junkins J. L., “Nonlinear differential equation solvers via adaptive Picard-Chebyshev iteration: applications in astrodynamics”, Journal of Guidance, Control, and Dynamics, 2019, https://doi.org/10.2514/1.G003318
[5] Masat A., Colombo C., “Kustaanheimo-Stiefel variables for planetary protection compliance analysis”, Journal of Guidance, Control, and Dynamics, 2022, https://doi.org/10.2514/1.G006255