“…In recent years, considerable research has been devoted to the problems of decision-making and interaction of AVs in intersection scenarios. These studies have employed various approaches, including rule-based methods [12], game-theoretic methods [5], [13], and data-driven techniques [8], [9], [14], among which RL is recognized as a flexible, efficient, and potent method. However, its widespread implementation is hindered by several obstacles, one of which is training an RL model that can effectively manage a range of driving situations and decision-making tasks [8].…”