“…Nowadays, models (pre-)trained on synthetic datasets have a broad range of uses, including feature matching (DeTone et al, 2018), autonomous driving (Siam et al, 2021), robotics indoor and aerial navigation (Nikolenko, 2021a), scene segmentation (Roberts et al, 2021), and anonymized image generation in healthcare (Piacentino et al, 2021). These approaches broadly follow one of three processes: pre-training on synthetic data before training on real-world scenes (DeTone et al, 2018; Hinterstoisser et al, 2019), compositing synthetic data with real images to create new images that contain the desired representation (Hinterstoisser et al, 2018), or generating realistic datasets with simulation engines like Unity (Borkman et al, 2021) or generative models like GANs (Jeon et al, 2021; Mustikovela et al, 2021). Each of these regimes has limitations, but one of the most common pitfalls is performance deterioration on real-world datasets.…”