“…Recent work in reinforcement learning has developed theoretical tools that reduce complexity by moving from a problem involving many agents to a collection of single agents, each optimized separately (Dibangoye et al, 2015). This has led to theoretically well-founded contributions, but with limited practical validation involving very few robots and simple tasks. Lacking theoretical foundations but relying instead on experimental validation, swarm robotics controllers have been developed with black-box optimization methods, ranging from brute-force optimization using a simplified (hence tractable) representation of a problem (Werfel et al, 2014) to evolutionary robotics (Hauert et al, 2008; Trianni et al, 2008; Gauci et al, 2012; Silva et al, 2016).…”