“…Exciting advances have been made in designing stable and high-performing empirical offline RL algorithms (Fujimoto et al, 2019; Laroche et al, 2019; Wu et al, 2019; Kumar et al, 2019, 2020; Agarwal et al, 2020; Kidambi et al, 2020; Siegel et al, 2020; Liu et al, 2020; Yang and Nachum, 2021; Yu et al, 2021). On the theoretical front, recent works have proposed efficient algorithms with theoretical guarantees, based on the principle of pessimism in the face of uncertainty (Liu et al, 2020; Buckman et al, 2020; Yu et al, 2020; Rashidinejad et al, 2021) or on variance reduction (Yin et al, 2020, 2021). Interested readers are encouraged to consult these works and the references therein.…”