From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning

Wang, Weixun; Yang, Tianpei; Liu, Yong; Hao, Jianye; Hao, Xiaotian; Hu, Youyou; Chen, Yingfeng; Fan, Changjie; Gao, Yang

doi:10.1609/aaai.v34i05.6221

Cited by 74 publications

(40 citation statements)

References 9 publications


“…Reusing replay buffer and policy distillation are the prevalent auxiliary training methods. [29] improves the efficiency of value-based MATL by reusing the transition data generated in previous scenarios. Inspired by policy distillation [22], Liu et al [16] proposes to transfer the knowledge learned in a single agent to multiple agents and uses the n-step return to approximate the difference of the local environment dynamics.…”

Section: Baseline Performance

mentioning

confidence: 99%

“…Existing CTDE research covers important topics such as division of agents [27], diversification [32] and exploration [19]. Recent works [29,11,1,17,16] have also started to make progress in transfer learning in cooperative MARL. For example, Liu et al [16] use policy distillation [22] to achieve fixed agent transfer learning.…”

Section: Introduction

mentioning

confidence: 99%

“…However, the agent population varies in different tasks in most cases. To solve this problem, DyAN [29] uses a graph neural network to adapt to dynamic agent population. UPDeT [11] uses Transformer [24] to realize a universal and transferable agent policy network to achieve agent-level knowledge transfer.…”

Section: Introduction

mentioning

confidence: 99%


Cooperative Multi-Agent Transfer Learning with Level-Adaptive Credit Assignment

Zhou, Zhang, Shao et al. 2021

Preprint

Self Cite


Extending transfer learning to cooperative multi-agent reinforcement learning (MARL) has recently received much attention. In contrast to the single-agent setting, the coordination indispensable in cooperative MARL constrains each agent's policy. However, existing transfer methods focus exclusively on agent policy and ignore coordination knowledge. We propose a new architecture that realizes robust coordination knowledge transfer through appropriate decomposition of the overall coordination into several coordination patterns. We use a novel mixing network named level-adaptive QTransformer (LA-QTransformer) to realize agent coordination that considers credit assignment, with appropriate coordination patterns for different agents realized by a novel level-adaptive Transformer (LA-Transformer) dedicated to the transfer of coordination knowledge. In addition, we use a novel agent network named Population Invariant agent with Transformer (PIT) to realize coordination transfer in a wider variety of scenarios. Extensive experiments in StarCraft II micro-management show that LA-QTransformer together with PIT achieves superior performance compared with state-of-the-art baselines.



“…However, current methods have poor representation learning ability and fail to exploit the common structure underlying the tasks this is because they tend to treat observation from different entities in the environment as an integral part of the whole. Worse yet, conventional models require the input and the output dimensions to be fixed ( [10], [11]), which makes zero-shot transfer impossible. Thus, the application of current methods is limited in real-world applications.…”

Section: Introduction

mentioning

confidence: 99%

Transformer Based Multi-Agent Framework

Zhu, Chang et al. 2021

2021 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)


We present a Transformer-like agent that learns the policy for multi-agent cooperation tasks, a breakthrough over traditional RNN-based multi-agent models, which must be retrained for each task. Our model handles varying inputs and outputs with strong transferability and can tackle different tasks in parallel. We are also the first to successfully integrate a transformer into a recurrent architecture, providing insight into stabilizing transformers in recurrent RL tasks.


“…Algorithm 1 MAPPO
Initialize θ, the parameters for policy π and φ, the parameters for critic V, using Orthogonal initialization (Hu et al, 2020)
Set learning rate α
while step ≤ step_max do
    set data buffer D = {}
    for i = 1 to num_rollouts do
        τ = [] (empty list)
        for t = 1 to T do
            for all agents a do
                p
We use the neural SLAM module and the local policy and directly use the trained model provided in origin ANS paper (Chaplot et al, 2020a).…”

mentioning

confidence: 99%

Learning Efficient Multi-Agent Cooperative Visual Exploration

Yu, Yang, Gao et al. 2021

Preprint


We consider the task of visual indoor exploration with multiple agents, where the agents need to cooperatively explore the entire indoor region using as few steps as possible. Classical planning-based methods often suffer from particularly expensive computation at each inference step and a limited expressiveness of cooperation strategy. By contrast, reinforcement learning (RL) has become a trending paradigm for tackling this challenge due to its modeling capability of arbitrarily complex strategies and minimal inference overhead. We extend the state-of-the-art single-agent RL solution, Active Neural SLAM (ANS), to the multi-agent setting by introducing a novel RL-based global-goal planner, Spatial Coordination Planner (SCP), which leverages spatial information from each individual agent in an end-to-end manner and effectively guides the agents to navigate towards different spatial goals with high exploration efficiency. SCP consists of a transformer-based relation encoder to capture intra-agent interactions and a spatial action decoder to produce accurate goals. In addition, we also implement a few multi-agent enhancements to process local information from each agent for an aligned spatial representation and more precise planning. Our final solution, Multi-Agent Active Neural SLAM (MAANS), combines all these techniques and substantially outperforms 4 different planning-based methods and various RL baselines in the photo-realistic physical testbed, Habitat.

