Abstract. Highly resolved intrinsic geometrical shapes used in three-dimensional parallel simulations of fluid flows consume a large portion of the available memory when loaded serially on every process. This calls for a memory-efficient implementation of a distributed geometry, which is, however, a non-trivial task when complex spatial domain decomposition methods for the flow domain are involved. To overcome this problem, an algorithm is proposed that generates a parallel geometry during mesh generation and enables a low-memory subdivision of the geometry based on the decomposition of the flow field. The applied meshing method generates computational grids that can be used for simulations on a quasi-arbitrary number of cores, on which the geometry is distributed in an efficient preprocessing step. This reduces the number of instances of the geometry in the global memory of the simulation to about one. The algorithm is used to generate a parallel geometry for a large shape consisting of 7 · 10^6 triangles, i.e., for a geometry representing the whole respiratory tract down to the 12th lung generation. For this case, performance and memory consumption measurements are given for simulations on 8,192 up to 131,072 cores and compared with results obtained from simulations using non-parallel geometries. The findings show that the new method not only reduces the memory usage by factors of 1,802 and 19,936 on 8,192 and 131,072 cores, respectively, but also yields a large speed-up factor of about 51 in the geometry I/O and preprocessing. Furthermore, the parallel geometry allows using the sweet spot with respect to a combination of distributed and shared memory parallelization, leading to an increase in computational speed by a factor of about 1.43.
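
The idea of subdividing a triangulated geometry according to the decomposition of the flow field can be illustrated with a minimal sketch. It is not the algorithm proposed here: it assumes, purely for illustration, that each process owns one axis-aligned box of the flow domain and that a triangle is kept by every process whose box overlaps the triangle's bounding box. All names (`triangle_bbox`, `boxes_overlap`, `distribute_triangles`) and the example data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
Triangle = Tuple[Point, Point, Point]
Box = Tuple[Point, Point]  # (min corner, max corner), an assumed subdomain shape


def triangle_bbox(tri: Triangle) -> Box:
    """Axis-aligned bounding box of one triangle."""
    lo = tuple(min(v[d] for v in tri) for d in range(3))
    hi = tuple(max(v[d] for v in tri) for d in range(3))
    return lo, hi


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes intersect or touch in all three directions."""
    return all(a[0][d] <= b[1][d] and b[0][d] <= a[1][d] for d in range(3))


def distribute_triangles(
    triangles: Sequence[Triangle], subdomains: Dict[int, Box]
) -> Dict[int, List[int]]:
    """Map each rank to the indices of the triangles overlapping its subdomain."""
    local: Dict[int, List[int]] = {rank: [] for rank in subdomains}
    for idx, tri in enumerate(triangles):
        bbox = triangle_bbox(tri)
        for rank, box in subdomains.items():
            if boxes_overlap(bbox, box):
                local[rank].append(idx)
    return local


# Hypothetical example: two subdomains split at x = 1 and three triangles,
# the last of which straddles the subdomain boundary.
subdomains = {
    0: ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    1: ((1.0, 0.0, 0.0), (2.0, 1.0, 1.0)),
}
triangles = [
    ((0.1, 0.1, 0.1), (0.4, 0.1, 0.1), (0.1, 0.4, 0.1)),
    ((1.5, 0.1, 0.1), (1.8, 0.1, 0.1), (1.5, 0.4, 0.1)),
    ((0.8, 0.5, 0.5), (1.2, 0.5, 0.5), (0.8, 0.9, 0.5)),
]
local = distribute_triangles(triangles, subdomains)
# Average number of copies of each triangle held across all ranks.
instances = sum(len(ids) for ids in local.values()) / len(triangles)
print(local, instances)
```

In such a scheme only triangles that cross subdomain boundaries are stored more than once, so the number of geometry instances in global memory stays close to one instead of growing with the number of processes, as it does when every process loads the full geometry.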