“…Consequently, our approach goes beyond prior policy distillation approaches to handle scenarios where supervision by the teacher model may be noisy and unsafe. We also note the relationship between such distillation and semi-supervised training via pseudo-labeling [8,40,45,62,64]. However, to the best of our knowledge, we are the first to develop a pseudo-labeling-based self-training method for learning safe driving policies from complex scenes with diverse navigation data, camera perspectives, geographical locations, and weather conditions.…”