2018
DOI: 10.48550/arxiv.1810.11187
Preprint

TarMAC: Targeted Multi-Agent Communication

Cited by 12 publications
(17 citation statements)
References 0 publications
“…Their training algorithm is an extension of the deep recurrent Q-learning (DRQN) (Hausknecht and Stone, 2015), which combines RNN and deep Q-learning (Mnih et al, 2015). Following Foerster et al (2016), various works (Jorge et al, 2016;Sukhbaatar et al, 2016;Havrylov and Titov, 2017;Das et al, 2017;Peng et al, 2017;Mordatch and Abbeel, 2018;Celikyilmaz et al, 2018;Das et al, 2018;Lazaridou et al, 2018;Cogswell et al, 2019) have proposed a variety of neural network architectures to foster communication among agents. These works combine single-agent RL methods with novel developments in deep learning, and demonstrate their performances via empirical studies.…”
Section: Learning To Communicate
confidence: 99%
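The DRQN backbone mentioned in the statement above combines a recurrent cell, which summarizes the history of partial observations, with a Q-learning head that maps the hidden state to per-action values. The following is a minimal NumPy sketch of that idea, not the cited implementation; all dimensions and the random weight initialization are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class RecurrentQNet:
    """DRQN-style sketch: a vanilla RNN cell whose hidden state summarizes
    the observation history, followed by a linear head that outputs one
    Q-value per action."""

    def __init__(self, obs_dim, hidden_dim, n_actions):
        self.W_xh = rng.normal(scale=0.1, size=(hidden_dim, obs_dim))
        self.W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
        self.W_hq = rng.normal(scale=0.1, size=(n_actions, hidden_dim))

    def step(self, obs, h):
        # h_t = tanh(W_xh o_t + W_hh h_{t-1});  Q(h_t) = W_hq h_t
        h_next = np.tanh(self.W_xh @ obs + self.W_hh @ h)
        q = self.W_hq @ h_next
        return q, h_next

net = RecurrentQNet(obs_dim=4, hidden_dim=8, n_actions=3)
h = np.zeros(8)
for t in range(5):              # roll the RNN over a short episode
    obs = rng.normal(size=4)    # partial observation at step t
    q, h = net.step(obs, h)
action = int(np.argmax(q))      # greedy action from the final Q-values
```

Because the Q-values depend on the recurrent state rather than a single observation, the agent can act on information gathered over the whole episode, which is what makes the architecture suitable for partially observed multi-agent settings.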
“…Attention in this work encodes each agent's individual observation before it passes through a centralized communication channel. Using a centralized value estimate, TarMAC [9] is a targeted communication architecture that generates agents' internal state representations as input to a centralized critic. Iqbal & Sha [25] introduced an attention-based critic to select agents during centralized training.…”
Section: Related Work
confidence: 99%
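The targeting mechanism referred to above can be sketched as signature-query soft attention: each sender broadcasts a signature (key) alongside its value vector, each receiver forms a query from its own state, and the attention weights determine whom each receiver listens to. A minimal NumPy sketch under assumed dimensions, with random vectors standing in for learned encodings:

```python
import numpy as np

rng = np.random.default_rng(1)
n_agents, d_key, d_value = 4, 8, 16

# Each sender broadcasts a signature (key) and a value vector;
# each receiver forms a query from its own hidden state.
signatures = rng.normal(size=(n_agents, d_key))   # k_i
values = rng.normal(size=(n_agents, d_value))     # v_i
queries = rng.normal(size=(n_agents, d_key))      # q_j

# Receiver j attends to sender i with weight
# softmax_i(q_j . k_i / sqrt(d_key)), so messages are targeted
# rather than uniformly broadcast.
logits = queries @ signatures.T / np.sqrt(d_key)
weights = np.exp(logits - logits.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)

# Aggregated incoming message for each receiver.
messages = weights @ values        # shape (n_agents, d_value)
```

The aggregated `messages` can then feed each agent's policy or a centralized critic; the attention weights are differentiable, so the targeting behavior is learned end to end.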
“…Considering the scale of agent teams in real scenarios, centralized training is challenging due to training instability and high computational complexity. Jiang & Lu [8] and Das et al. [9] both proposed attention-based communication protocols to exchange messages in MARL domains, but these approaches did not consider agents' behavioral knowledge. Communication methods in MARL are typically designed to solve the problem of efficiently sharing partial observation information.…”
Section: Introduction
confidence: 99%
“…Multi-Actor-Attention-Critic (MAAC) is proposed in [19] to aggregate information from all other agents using an attention mechanism. Similarly, [11], [13], [20] also employ the attention mechanism to learn a representation for the action-value function. However, the communication graphs used there are either dense or ad hoc (k nearest neighbors), which makes learning difficult.…”
Section: A Related Work
confidence: 99%
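The k-nearest-neighbor communication graphs mentioned above restrict attention to each agent's closest peers instead of the full team. A small NumPy sketch of such masked attention, where positions, features, and k are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n_agents, k, d = 6, 2, 8

positions = rng.uniform(size=(n_agents, 2))   # agent locations
feats = rng.normal(size=(n_agents, d))        # per-agent features

# Build a k-nearest-neighbor communication graph from pairwise distances.
dists = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1)
np.fill_diagonal(dists, np.inf)               # no self edges
neighbors = np.argsort(dists, axis=1)[:, :k]  # k closest agents per row

# Dot-product attention restricted to the k-NN graph via an additive mask:
# non-neighbor logits become -inf, so their softmax weight is exactly 0.
logits = feats @ feats.T / np.sqrt(d)
mask = np.full((n_agents, n_agents), -np.inf)
rows = np.repeat(np.arange(n_agents), k)
mask[rows, neighbors.ravel()] = 0.0
masked = logits + mask
weights = np.exp(masked - masked.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)

aggregated = weights @ feats   # each agent mixes features of its k neighbors
```

With a fixed k the per-agent cost no longer grows with team size, which is the practical motivation for sparse graphs; the criticism in the statement above is that such ad hoc sparsity is not learned from the task.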