Abstract. Coupled reactive transport simulations are extremely demanding in terms of required computational power, which hampers their application and leads to coarsened and oversimplified domains. The chemical sub-process represents the major bottleneck: its acceleration is an urgent challenge which gathers increasing interdisciplinary interest along with pressing requirements for subsurface utilization such as spent nuclear fuel storage, geothermal energy and CO2 storage. In this context we developed POET (POtsdam rEactive Transport), a research parallel reactive transport simulator integrating algorithmic improvements which decisively speed up coupled simulations. In particular, POET is designed with a master/worker architecture, which ensures computational efficiency on both multicore and cluster compute environments. POET does not rely on contiguous grid partitions for the parallelization of chemistry, but forms work packages composed of grid cells distant from each other. Such scattering prevents particularly expensive geochemical simulations, usually concentrated in the vicinity of a reactive front, from generating load imbalance between the available CPUs, as is often the case with classical partitions. Furthermore, POET leverages an original implementation of the Distributed Hash Table (DHT) mechanism to cache the results of geochemical simulations for reuse in subsequent time-steps of the coupled simulation. The caching is hence particularly advantageous for initially chemically homogeneous simulations and for smooth reaction fronts. We tune the rounding employed in the DHT on a 2D benchmark to validate the caching approach, and we evaluate the performance gain of POET's master/worker architecture and the DHT speedup on a 3D benchmark comprising around 650 k grid elements. The runtime for 200 coupling iterations, corresponding to 960 simulation days, was reduced from about 24 h on 11 workers to 29 minutes on 719 workers.
Activating the DHT reduces the runtime further, to 2 h on 11 workers and 8 minutes on 719 workers, respectively. Only with such reduced hardware requirements and computational costs is it possible to realistically perform large-scale, long-term, complex reactive transport simulations, as well as the uncertainty analyses required by pressing societal challenges connected with subsurface utilization.
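
The two algorithmic ideas named above, scattered work packages and a cache keyed on rounded inputs, can be illustrated with a minimal Python sketch. It is not POET's implementation. The names `scattered_packages`, `round_sig`, `cache_key` and `ChemistryCache` are hypothetical, the round-robin assignment is only one possible way to form packages of non-adjacent cells, rounding to significant digits is an assumed rounding scheme, and a local dictionary stands in for the distributed hash table.

```python
import math


def scattered_packages(n_cells, n_packages):
    """Round-robin assignment: cell i goes to package i % n_packages,
    so neighbouring cells (e.g. along a reactive front) end up in
    different packages. Assumed scheme, for illustration only."""
    return [list(range(p, n_cells, n_packages)) for p in range(n_packages)]


def round_sig(x, digits):
    """Round x to `digits` significant digits (zero stays zero)."""
    if x == 0:
        return 0.0
    return round(x, digits - 1 - math.floor(math.log10(abs(x))))


def cache_key(state, digits=5):
    """Build a hashable key from a chemical state by rounding each entry."""
    return tuple(round_sig(v, digits) for v in state)


class ChemistryCache:
    """Local stand-in for a distributed hash table of chemistry results."""

    def __init__(self, solver, digits=5):
        self.solver = solver      # hypothetical expensive geochemical solver
        self.digits = digits      # rounding precision, to be tuned
        self.table = {}
        self.hits = 0
        self.misses = 0

    def simulate(self, state):
        key = cache_key(state, self.digits)
        if key in self.table:
            self.hits += 1        # reuse a previously computed result
        else:
            self.misses += 1      # run the solver and store its result
            self.table[key] = self.solver(state)
        return self.table[key]
```

With fewer retained digits more states share a key, which raises the hit rate at the cost of accuracy; this is the trade-off that tuning the rounding has to settle.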