“…[10,11,12]), the majority of works with real robots have focused on approaches that learn locomotion skills in simulation, and then transfer the resulting controllers to the hardware [13,14,15,16]. Examples include the traversal of rough terrain with a quadruped [17,18] and robust walking and stair climbing with a biped [19,20,21]. With a sufficiently accurate simulation model (for instance via learned actuator models [13]) and policies that are robust to or can adapt to distribution shift (e.g.…”