Rock-Paper-Scissors is a popular zero-sum game of cyclic nature with a mixed-strategy Nash equilibrium. It has been the subject of a large number of studies and is of particular interest to economics, sociology, and artificial intelligence. While numerous studies explore its evolutionary dynamics and learning, the overwhelming majority consider the game in its classical form, and two potentially relevant axes remain largely unexplored. First, studies with policy-based reinforcement learning algorithms are lacking; second, few existing investigations have attempted to study such cyclic games with more than two players. The present work aims to address both gaps.