“…From there, the use of synthetic visual data generated from virtual environments has kept growing. We found works using synthetic data for object detection/recognition [66,67,68,69], object viewpoint recognition [70], re-identification [71], and human pose estimation [72]; building synthetic cities for autonomous driving tasks such as semantic segmentation [44,73], place recognition [74], object tracking [45,75], object detection [76,77], stixel computation [78], and benchmarking different on-board computer vision tasks [47]; building indoor scenes for semantic segmentation [79] as well as normal and depth estimation [80]; generating GT for optical flow, scene flow, and disparity [81,82]; generating augmented reality images to support object detection [83]; simulating adverse atmospheric conditions such as rain or fog [84,85]; and even procedurally generating videos for human action recognition [86,87]. Moreover, since robotics and autonomous driving rely on sensorimotor models that must be trained and tested dynamically, in recent years the use of simulators has intensified beyond datasets [48,49,88,89].…”