Computational fluid and particle dynamics (CFPD) simulations are of paramount importance for studying and improving drug effectiveness. Computational requirements of CFPD codes demand high-performance computing (HPC) resources. For these reasons, we introduce and evaluate in this article system software techniques for improving performance and tolerating load imbalance on a state-of-the-art production CFPD code. We demonstrate benefits of these techniques on Intel-, IBM- and Arm-based HPC technologies ranked in the Top500 supercomputers, showing the importance of using mechanisms applied at runtime to improve the performance independently of the underlying architecture. We run a real CFPD simulation of particle tracking on the human respiratory system, showing performance improvements of up to 2×, across different architectures, while applying runtime techniques and keeping constant the computational resources.