Particle filters are a family of algorithms that solve inverse problems with statistical Bayesian methods when the model does not satisfy the linearity and Gaussianity assumptions. Particle filters are used in domains such as data assimilation, probabilistic programming, neural network optimization, and localization and navigation. They estimate the probability distribution of model states by running a large number of model instances, the so-called particles. The ability to handle a very large number of particles is critical for high-dimensional models. This paper proposes a novel paradigm for running very large ensembles of parallel model instances on supercomputers. The approach combines an elastic and fault-tolerant runner/server model that minimizes data movement while enabling dynamic load balancing. Particle weights are computed locally on each runner and transmitted, as they become available, to a server that normalizes them, resamples new particles based on their weights, and dynamically redistributes work to runners to react to load imbalance. Our approach relies on an asynchronously managed distributed particle cache that permits particles to move from one runner to another in the background while particle propagation goes on. This also enables the number of runners to vary during the execution, either in reaction to failures or to adapt to changing resource availability.
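
To make the server-side step concrete, the sketch below shows one way to normalize gathered particle weights and resample a new particle set. It is a minimal illustration, not the paper's implementation: the function name `normalize_and_resample` is hypothetical, and systematic resampling is assumed as one standard scheme, since the text above does not prescribe a specific one.

```python
import numpy as np

def normalize_and_resample(weights, rng=None):
    """Sketch of the server-side step described above: normalize the
    unnormalized particle weights received from the runners, then draw
    the indices of the particles kept for the next assimilation cycle.

    Systematic resampling is used here as one common choice (an
    assumption; the abstract does not name a scheme).
    """
    rng = rng or np.random.default_rng()
    w = np.asarray(weights, dtype=float)
    w /= w.sum()  # normalize weights so they sum to 1

    n = len(w)
    # Systematic resampling: a single uniform draw shifted by evenly
    # spaced offsets, mapped onto the cumulative weight distribution.
    positions = (rng.random() + np.arange(n)) / n
    return np.searchsorted(np.cumsum(w), positions)

# Example: heavier particles tend to be duplicated, lighter ones dropped.
# The returned indices would then be assigned back to runners.
idx = normalize_and_resample([0.1, 2.0, 0.5, 3.0])
print(idx)  # e.g. [1 1 3 3]
```

In a runner/server setting such as the one outlined above, only these resampled indices need to travel back to the runners; the particle states themselves can stay in (or migrate through) the distributed cache.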