Abstract. This paper addresses a principal problem in the in vivo evolution of modular multi-cellular robots, where robot 'babies' can be produced with arbitrary shapes and sizes. Such a system needs a generic learning mechanism that enables newborn morphologies to acquire a suitable gait quickly after 'birth'. In this study we investigate and compare the reinforcement learning method RL PoWeR with HyperNEAT. We conduct simulation experiments using robot morphologies of different sizes and complexities. The experiments give insight into the differences in solution quality and algorithm efficiency, and suggest that reinforcement learning is the preferred option for this online learning problem.