“…Unlike traditional RL, in which learning is driven by an extrinsic reward signal, intrinsically motivated RL concerns task-agnostic learning (Sontakke et al, 2021b,a). Similar to infant animals (Touwen et al, 1992), the agent may undergo a developmental period in which it acquires reusable modular skills (Kaplan and Oudeyer, 2003; Weng et al, 2001; Tian et al, 2021), such as curiosity and confidence (Schmidhuber, 1991a; Kompella et al, 2017; Burda et al, 2018; Mirza et al, 2020; Groth et al, 2021; Huang et al, 2022). Another aspect of such general competence is the agent's ability to remain safe during both learning and deployment (García and Fernández, 2015).…”