“…Even though most autonomous driving (AD) stacks [30,48] today use individual modules for perception, planning and control, end-to-end approaches have been proposed since the 1980s [35], and the success of deep learning brought them back into the research spotlight [5,50]. Numerous works have studied different network architectures for this task [3,16,52], yet most of these approaches use supervised learning with expert demonstrations, which is known to suffer from covariate shift [36,40]. While data augmentation based on view synthesis [2,5,35] can partially alleviate this issue, in this paper we tackle the problem from the perspective of expert demonstrations.…”