“…Modern RL: Over the last decade, a series of methodological advances has given rise to a new generation of RL methods that can tackle complex real-world problems (Mnih Silver et al, 2016; Barozet et al, 2020; Lee et al, 2020) and that have also been successfully applied to DAC. In particular, modern RL methods based on deep neural networks can effectively learn useful representations that allow them to handle complex state and action spaces, using, e.g., (double) deep Q-learning (DDQN, Hansen, 2016; Sharma et al, 2019; Speck et al, 2021; Bhatia et al, 2021), modern actor-critic (Ichnowski et al, 2021), and policy gradient methods (Daniel et al, 2016; Xu et al, 2019; Gomoluch et al, 2019; Shala et al, 2020; Getzelman & Balaprakash, 2021; Almeida et al, 2021).…”