“…Through trial-and-error interactions with the environment, Reinforcement Learning (RL) offers a promising approach to solving decision-making and optimization problems. Over the past few years, RL has accomplished impressive feats in handling difficult tasks, in such domains as autonomous driving [119,16], locomotion control [99,129], robotics [71,94], continuous control [5,6,7], and multi-agent systems and control [39,15]. A majority of these successful approaches are purely data-driven and leverage trial-and-error to freely explore the search space.…”