2020
DOI: 10.1609/aaai.v34i05.6205

Communication Learning via Backpropagation in Discrete Channels with Unknown Noise

Abstract: This work focuses on multi-agent reinforcement learning (RL) with inter-agent communication, in which communication is differentiable and optimized through backpropagation. Such differentiable approaches tend to converge more quickly to higher-quality policies compared to techniques that treat communication as actions in a traditional RL framework. However, modern communication networks (e.g., Wi-Fi or Bluetooth) rely on discrete communication channels, for which existing differentiable approaches that conside…

Cited by 11 publications (24 citation statements)
References 12 publications
“…Agent-Entity Graph [29] also uses distance to measure nearby agents while as long as two agents are close to each other will be allowed to communicate. LSC [45] enables agents within a cluster radius to decide whether to become a leader agent. Then all non-leader agents in a cluster will communicate with the only one leader agent.…”
[Table fragment spilled into the excerpt: … [38]; MAGNet-SA-GS-MG [46]; Agent-Entity Graph [29]; LSC [45]; NeurComm [32]; IP [33]; FlowComm [14]; GAXNet [44]. Other Agents: DIAL [11]; RIAL [11]; CommNet [12]; BiCNet [36]; TarMAC [20]; MADDPG-M [47]; IC3Net [24]; SchedNet [25]; DCC-MD [48]; VBC [39]; Diff Discrete [49]; I2C [28]; IS [34]; ETCNet [27]; Variable-length Coding [43]; TMC [40]. Proxy: MS-MARL-GCM [50]; ATOC [23]; MD-MADDPG [37]; IMAC [21]; GA-Comm [22]; Gated-ACML [26]; HAMMER [51]; MAGIC [13].]
Section: Communicatee Type
Citation type: mentioning
confidence: 99%
“…DIAL [11], RIAL [11], CommNet [12], and BiCNet [36] learn a communication protocol which connect all agents together. Diff Discrete [49] and Variable-length Coding [43] consider two-agent cases while do not learn to block messages from each other. TarMAC [20] and IS [34] learn meaningful messages while using a broadcast way to share messages thus still using full communication.…”
Section: Communication Policy
Citation type: mentioning
confidence: 99%
“…5.2.1 Multi-Agent Pathfinding with Individual Rewards. To further illustrate the robustness of FCMNet, we consider the simple partially-observable, cooperative multi-agent pathfinding task introduced in [9] (Hidden-Goal Path-Finding), with a team of 𝑛 = 5 agents. In this task, each agent has a unique target location it needs to reach as soon as possible, and whose position may change randomly at every timestep.…”
Section: Robustness Experiments
Citation type: mentioning
confidence: 99%
“…We experimentally evaluate our proposed model on a range of unit micromanagement tasks in the StarCraft II Multi-Agent Challenge [21], as well as on a partially-observable multi-agent pathfinding task [9]. Our results show that FCMNet outperforms existing state-of-the-art CL methods in all StarCraft II micromanagement tasks and value decomposition methods in certain tasks.…”
Section: Introduction
Citation type: mentioning
confidence: 99%