2020
DOI: 10.1609/aaai.v34i05.6212
Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning

Abstract: Social psychology and everyday experience show that cognitive consistency plays an important role in keeping human society in order: when people hold a more consistent cognition about their environment, they are more likely to achieve better cooperation. Meanwhile, only cognitive consistency within a neighborhood matters, because humans interact directly only with their neighbors. Inspired by these observations, we take the first step toward introducing neighborhood cognitive consistency (NCC) into multi-agent reinforcement…

Cited by 52 publications (21 citation statements)
References 10 publications
“…Recently, direct communication methods embed communication channels in deep neural networks, so that specific channels can selectively carry information exchange between agents [20]. This approach has proven very effective for learning communication protocols [21][22][23]. Typically, continuous transmission between agents through the network forms a communication channel, which lets each agent consider local and global information simultaneously during learning.…”
Section: Related Work
Mentioning confidence: 99%
“…Weighted QMIX [18] proposes a weighted projection to decompose any joint action-value function. Other works investigate MARL from the perspectives of coordination graphs [24][25][26], communication [27,28,10], and role-based learning [12,29].…”
Section: Related Work
Mentioning confidence: 99%
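The value-decomposition family referenced above builds on monotonic mixing of per-agent Q-values; a minimal sketch under assumed shapes follows. It shows only the plain monotonicity constraint (non-negative mixing weights); Weighted QMIX's weighted projection is more involved and is not reproduced here.

```python
import numpy as np

def monotonic_mix(agent_qs, raw_w, b):
    """Mix per-agent Q-values into a joint Q-value.

    agent_qs: (n_agents,) individual Q-values; raw_w: (n_agents,)
    unconstrained weights; b: scalar bias. Taking |raw_w| enforces
    non-negative weights, so the joint Q is monotone in each agent's
    Q and greedy individual actions stay consistent with the greedy
    joint action.
    """
    w = np.abs(raw_w)
    return float(agent_qs @ w + b)
```

In QMIX-style methods the weights and bias would be produced by a hypernetwork conditioned on the global state; fixed arrays are used here only to keep the sketch self-contained.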
“…SEAC [35] partially solves this problem by sharing trajectories only for off-policy training. NCC [28] maintains cognition consistency through representation alignment between neighbors. Roy et al. [36] force each agent to predict others' local policies and add a coach for group experience alignment.…”
Section: Related Work
Mentioning confidence: 99%
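The neighbor representation alignment attributed to NCC above can be sketched as a simple consistency penalty. This is a hedged illustration, not the paper's exact loss: the function name, the mean "neighborhood cognition," and the squared-error formulation are assumptions made for the sketch.

```python
import numpy as np

def ncc_alignment_loss(cognitions, neighborhoods):
    """Penalize cognitive divergence within each neighborhood.

    cognitions: (n_agents, d) latent "cognition" vectors, one per agent;
    neighborhoods: dict mapping agent id -> list of neighbor ids.
    Each agent and its neighbors are pulled toward their group mean,
    so only local (neighborhood) consistency is enforced.
    """
    loss = 0.0
    for i, neigh in neighborhoods.items():
        group = cognitions[[i] + list(neigh)]   # agent plus neighbors
        center = group.mean(axis=0)             # neighborhood cognition
        loss += np.mean((group - center) ** 2)  # pull members toward it
    return loss / len(neighborhoods)
```

In training, a term like this would be added to the usual RL objective so that neighboring agents converge to consistent latent views of their shared environment.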
“…These methods usually bring interdisciplinary techniques into the MDRL field to help solve MAS problems. Q-value path decomposition (QDP) [27] integrates gradient-attribution techniques into MDRL to directly decompose global Q-values along trajectory paths and assign credit to agents; it has also been applied to StarCraft II micromanagement tasks with good performance. NCC-MARL [28] introduces neighborhood cognitive consistency (NCC) into MDRL to facilitate large-scale teamwork tasks such as football player control. Mao et al. [29] propose a novel reward-design method to accelerate the formation of better policies; the method is specially designed for the packet-routing application.…”
Section: Related Work
Mentioning confidence: 99%