“…Ashby, for example, in his concept of ultrastability (Ashby, 1960), formulated perhaps the first mechanistic account of open-ended learning, namely as the random exploration of a large space of sensorimotor loops with the aim of achieving homeostatic equilibrium (for use of this idea in more recent work also see Di Paolo, 2000, 2003, 2010; Harvey et al, 2005; Iizuka and Di Paolo, 2007, 2008; Di Paolo and Iizuka, 2008; Manicka and Di Paolo, 2009; Izquierdo et al, 2013). Parallels with reinforcement learning (Sutton and Barto, 2009) and related sensorimotor approaches (e.g., Duff et al, 2011; Maye and Engel, 2011, 2013) can be drawn as well. For instance, the exploration-exploitation trade-off characteristic of such approaches is related to the assimilation-accommodation dichotomy in equilibration; and the global equilibrium towards which these systems tend is one of maximum expected reward, in analogy with the state of maximum equilibration.…”
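The mechanism attributed to Ashby in the passage above can be illustrated with a toy sketch. It is not Ashby's homeostat and is not taken from any of the cited works. It makes several simplifying assumptions: a single "essential variable" `x`, a one-parameter linear feedback loop standing in for a sensorimotor loop, and uniform random re-drawing of that parameter whenever `x` leaves its viable bounds. The name `ultrastable_search` and all of its parameters and numeric ranges are hypothetical.

```python
import random


def ultrastable_search(seed=0, limit=1.0, max_steps=10_000, settle_steps=200):
    """Toy ultrastability loop (illustrative assumption, not Ashby's device).

    Returns (gain, resets): the loop parameter that kept the essential
    variable within bounds for `settle_steps` consecutive steps, and the
    number of random re-draws that were needed. Returns (None, resets)
    if no such parameter is found within `max_steps`.
    """
    rng = random.Random(seed)
    gain = rng.uniform(-2.0, 2.0)  # current candidate loop parameter
    x = 0.5                        # essential variable, started inside its bounds
    resets = 0
    stable_for = 0

    for _ in range(max_steps):
        # Loop dynamics: feedback through `gain` plus a small perturbation.
        x = gain * x + rng.uniform(-0.05, 0.05)

        if abs(x) > limit:
            # Essential variable left its viable range: explore at random
            # by drawing a new loop parameter and restarting the variable.
            gain = rng.uniform(-2.0, 2.0)
            x = 0.5
            resets += 1
            stable_for = 0
        else:
            stable_for += 1
            if stable_for >= settle_steps:
                # Treated here as having reached homeostatic equilibrium.
                return gain, resets

    return None, resets


if __name__ == "__main__":
    gain, resets = ultrastable_search()
    print("retained gain:", gain, "after", resets, "random re-draws")
```

In this sketch the random re-draw plays the role of exploration, and keeping a parameter while the variable stays in bounds plays the role of exploitation. The analogy to reinforcement learning drawn in the passage is looser than that: nothing here estimates or maximizes an expected reward.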