This paper addresses a class of strategic games called potential games and develops a novel learning algorithm, Payoff-based Inhomogeneous Partially Irrational Play (PIPIP). The present algorithm builds on Distributed Inhomogeneous Synchronous Learning (DISL), presented in an existing work, but, unlike DISL, PIPIP allows agents to make irrational decisions with a specified probability, i.e., an agent can choose an action with a low utility from among the past actions stored in its memory. Thanks to these irrational decisions, we can prove convergence in probability of the collective actions to the potential function maximizers. Finally, we demonstrate the effectiveness of the present algorithm through experiments on a sensor coverage problem. The demonstration reveals that the learning algorithm successfully leads agents to neighborhoods of the potential function maximizers even in the presence of undesirable Nash equilibria. We also see, through an experiment with a moving density function, that PIPIP adapts to environmental changes.
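The precise PIPIP update rule is specified in the paper itself; as a loose sketch of the "partially irrational" idea described above — with some probability an agent deliberately selects a lower-utility action from its memory instead of the best one — consider the following. The function name, the memory format, and the `epsilon` parameter are illustrative assumptions, not the paper's formulation:

```python
import random

def partially_irrational_choice(memory, epsilon):
    """Pick one action from a memory of (action, payoff) pairs.

    With probability epsilon, make an "irrational" decision by
    returning the remembered action with the LOWEST payoff;
    otherwise return the remembered action with the highest payoff.
    This mirrors the spirit, not the exact rule, of PIPIP.
    """
    best = max(memory, key=lambda ap: ap[1])
    worst = min(memory, key=lambda ap: ap[1])
    if random.random() < epsilon:
        # Irrational step: deliberately take the low-utility action,
        # which is what lets the process escape undesirable equilibria.
        return worst[0]
    return best[0]
```

In the paper's analysis, the corresponding exploration probability is tied to a design parameter and decreases over time (the "inhomogeneous" part), which is what yields convergence in probability rather than mere ergodic exploration.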
Index Terms—potential game, learning algorithm, cooperative control, multi-agent system

Tatsuhiko Goto is with Toshiba Corporation; Takeshi Hatanaka (corresponding author) and Masayuki Fujita are with
In this paper, we investigate game-theoretic coverage control, whose objective is to lead agents to optimal configurations over a mission space. In particular, the objective of this paper is to achieve the control objective (i) in the absence of perfect prior knowledge of the importance of each point and (ii) in the presence of action constraints. For this purpose, we first formulate coverage problems with two different global objective functions as so-called potential games. Then, we present a payoff-based learning algorithm that determines actions based only on past actual outcomes. The distinguishing feature of the present algorithm is that it allows an agent to take an irrational action. We also clarify the relation between a design parameter of the algorithm and the probability with which agents take the optimal actions, and prove that this probability can be made arbitrarily high. Finally, we demonstrate the effectiveness of the present algorithm through experiments on a testbed.
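The paper's own utility design for the coverage game is not reproduced here; as a hedged illustration of how a coverage problem can be cast as a potential game at all, the sketch below uses an equal-share utility on a 1-D grid: each point's importance is split among the agents that sense it. This is a congestion-game structure, hence a potential game by Rosenthal's classical result. The function name, grid setup, and sensing-radius parameter are illustrative assumptions:

```python
def coverage_utility(agent, positions, density, radius=1):
    """Equal-share payoff for one agent in a 1-D grid coverage game.

    positions: grid cell occupied by each agent.
    density:   importance weight of each grid point.
    A point's weight is credited only to agents within `radius` of it,
    divided equally among them (a congestion-game resource rule).
    """
    total = 0.0
    for point, weight in enumerate(density):
        if abs(positions[agent] - point) <= radius:
            # Count every agent sensing this point, including this one.
            coverers = sum(1 for p in positions if abs(p - point) <= radius)
            total += weight / coverers
    return total
```

Because each grid point behaves like a shared resource whose payoff depends only on how many agents cover it, the sum of marginal improvements any unilateral move produces is tracked by a single potential function, which is the structural property the learning algorithm exploits.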
In this paper we consider potential game theoretic attitude coordination. We focus in particular on two ordered configurations: "synchronization" and "balanced circular formation". We first show that both problems constitute potential games under suitable global and individual objective functions, and that a learning algorithm called Restrictive Spatial Adaptive Play (RSAP) leads robots to the ordered configurations with high probability even in the presence of mobility constraints. We moreover show that the problem also constitutes a group-based potential game, under which convergence of the action distribution to the stationary distribution can be accelerated. Finally, the effectiveness of the schemes is demonstrated through simulation and experiments on a testbed.