“…The potential to reduce the risks in deploying RL policies is gaining researchers' interest. There are many works on offline RL [1, 2, 6, 8, 11, 14, 15, 20-22, 24, 29, 44-46] and OPE [3, 7, 9, 18, 23, 30, 33, 36-40], and also on their applicability in RecSys practice [5, 12, 13, 25, 27, 32, 34, 35, 43].…”