The estimation of optical flow is an ambiguous task due to the lack of correspondence at occlusions, shadows, reflections, lack of texture and changes in illumination over time. Thus, unsupervised methods face major challenges as they need to tune complex cost functions with several terms designed to handle each of these sources of ambiguity. In contrast, supervised methods avoid these challenges altogether by relying on explicit ground truth optical flow obtained directly from synthetic or real data. In the case of synthetic data, the ground truth provides an exact and explicit description of what optical flow to assign to a given scene. However, the domain gap between synthetic data and real data often limits the ability of a trained network to generalize. In the case of real data, the ground truth is obtained through multiple sensors and additional data processing, which might introduce persistent errors and contaminate it. As a solution to these issues, we introduce a novel method to build a training set of pseudo-real images that can be used to train optical flow in a supervised manner. Our dataset uses two unpaired frames from real data and creates pairs of frames by simulating random warps, occlusions with super-pixels, shadows and illumination changes, and associates them to their corresponding exact optical flow. We thus obtain the benefit of directly training on real data while having access to an exact ground truth. Training with our datasets on the Sintel and KITTI benchmarks is straightforward and yields models on par or with state of the art performance compared to much more sophisticated training approaches.