Machine learning augmented turbulence modelling has surged recently as a promising approach for addressing the limitations of Reynolds-averaged Navier-Stokes (RANS) models. This work presents the development of the first open-source dataset curated and structured for immediate use in machine learning augmented corrective turbulence closure modelling. The dataset features a variety of RANS simulations with matching direct numerical simulation (DNS) and large-eddy simulation (LES) data. Four turbulence models are selected to form the initial dataset: k-ε, k-ε-ϕt-f, k-ω, and k-ω SST. The dataset consists of 29 cases per turbulence model, drawn from several parametrically swept reference DNS/LES flows: periodic hills, square duct, parametric bumps, converging-diverging channel, and curved backward-facing step. At each of the 895,640 points, various RANS features with corresponding DNS/LES labels are available. The feature set includes quantities used in current state-of-the-art models, as well as additional fields that enable the generation of new feature sets. The dataset reduces the effort required to train, test, and benchmark new corrective RANS models. The dataset is available at DOI 10.34740/kaggle/dsv/2637500.