“…Based on game theory, many advanced algorithms have been employed to obtain system equilibria, such as correlated-Q (CE-Q) [25], asymmetric-Q [26], and their modifications [24]. The author's previous work on single-agent reinforcement learning (SARL) and MAS-SG has demonstrated that an optimal AGC can be achieved when the number of agents is relatively small [27][28][29][30][31][32][33][34][35]. However, multiple equilibria may emerge as the number of agents increases; the extensive online calculation of all system equilibria then inevitably takes longer and may even lead to a severe collapse of system stability.…”