2021
DOI: 10.48550/arxiv.2105.08692
Preprint

Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition

Bo Liu, Qiang Liu, Peter Stone, et al.

Abstract: In real-world multiagent systems, agents with different capabilities may join or leave without altering the team's overarching goals. Coordinating teams with such dynamic composition is challenging: the optimal team strategy varies with the composition. We propose COPA, a coach-player framework to tackle this problem. We assume the coach has a global view of the environment and coordinates the players, who only have partial views, by distributing individual strategies. Specifically, we 1) adopt the attention m…

Cited by 2 publications (2 citation statements)
References 29 publications
“…In recent times, Majumdar et al [22] present a line of work where MADDPG agents learn different strategies by de-coupling and automatically weighting individual and global goals in a population-based training paradigm. Liu et al [18] consider the problem of coordination with dynamic composition, where a coach agent with global view distribute individual strategies to dynamically varying player agents. Jiang et al [12] promote sub-task specialization via emergence of individuality through generating dynamic intrinsic rewards.…”
Section: Related Literature (citation type: mentioning)
confidence: 99%
“…Recently, [37] present an extension of MADDPG with separately learning individual and global goals in a population-based training paradigm. [38] tackles the problem of dynamic team composition in coach-player paradigm. However, none of them explicitly address the setting of unequal competition, with focus on effects of incentives in offsetting the inequality.…”
Section: Related Work (citation type: mentioning)
confidence: 99%