Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022
DOI: 10.24963/ijcai.2022/85

Multi-Agent Concentrative Coordination with Decentralized Task Representation

Abstract: Value-based multi-agent reinforcement learning (MARL) methods hold the promise of promoting coordination in cooperative settings. Popular MARL methods mainly focus on the scalability or the representational capacity of value functions. Such a learning paradigm can reduce agents' uncertainties and promote coordination. However, these methods fail to leverage task-structure decomposability, which generally exists in real-world multi-agent systems (MASs), leading to a significant amount of time exploring the optimal p…
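As background for the value-based, CTDE-style learning the abstract refers to, here is a minimal sketch of VDN-style additive value decomposition, the simplest member of this family. All names (AgentQNet, vdn_td_loss, the batch layout) are illustrative assumptions, not code from the paper.

```python
# Minimal sketch of VDN-style additive value decomposition (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AgentQNet(nn.Module):
    """Per-agent utility network Q_i(o_i, .), used for decentralized execution."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs):          # obs: [batch, obs_dim]
        return self.net(obs)         # -> [batch, n_actions]

def vdn_td_loss(agents, target_agents, batch, gamma=0.99):
    """Centralized training loss with additive mixing: Q_tot = sum_i Q_i."""
    obs, actions, reward, next_obs, done = batch   # per-agent lists + shared reward/done
    q_tot = torch.zeros_like(reward)
    target_q_tot = torch.zeros_like(reward)
    for i, (net, tgt) in enumerate(zip(agents, target_agents)):
        # Utility of the action each agent actually took.
        q_i = net(obs[i]).gather(1, actions[i].unsqueeze(1)).squeeze(1)
        q_tot = q_tot + q_i                                    # additive value decomposition
        target_q_tot = target_q_tot + tgt(next_obs[i]).max(dim=1).values
    td_target = reward + gamma * (1.0 - done) * target_q_tot.detach()
    return F.mse_loss(q_tot, td_target)
```

Replacing the plain sum with a learned, state-conditioned mixing function is what gives richer decompositions the extra representational capacity mentioned above; a sketch of that variant appears with the citation statements below.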

Cited by 8 publications (4 citation statements); references 0 publications.
“…LILAC [9] learns a leader to assign roles. Another line of work, such as [38,39,28,13], divides the agents into groups that carry out similar sub-tasks with a specific policy or value function. In our work, we learn more stable and distinguishable group embeddings and further consider the integration of team-level strategy and individual-level decision making.…”
Section: Related Work (mentioning)
confidence: 99%
“…The B2MAPO framework provides a universal, modularized plug-and-play architecture, which can conveniently integrate third-party models with little or no modification to exploit their merits. The "CTDE-based joint policy" module in this layer can be implemented by various MARL methods, such as Q-learning-based methods [30,35,39], policy-based methods [41], and actor-critic methods [44].…”
Section: Batch By Batch Policy Optimization Layer (mentioning)
confidence: 99%
“…Currently, in the field of multi-agent reinforcement learning (MARL), three main Centralized Training and Decentralized Execution (CTDE) [8,10] paradigms are predominantly followed: multi-agent policy gradient methods [41], value decomposition methods [30,35,39], and actor-critic methods [44]. These approaches typically assume independence among agents' policies and update all agents simultaneously.…”
Section: Related Work (mentioning)
confidence: 99%
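As a concrete illustration of the "value decomposition methods" cited in this statement, below is a hedged sketch of a QMIX-style monotonic mixing network: per-agent utilities are combined by a mixer whose weights are generated from the global state and kept non-negative, so the joint greedy action still decomposes into per-agent greedy actions. The class name and layer sizes are hypothetical, and the hypernetworks are simplified to single linear layers.

```python
# Illustrative sketch of a QMIX-style monotonic mixing network (not from the cited papers).
import torch
import torch.nn as nn

class MonotonicMixer(nn.Module):
    """Mixes per-agent utilities into Q_tot with non-negative, state-conditioned weights."""
    def __init__(self, n_agents: int, state_dim: int, embed: int = 32):
        super().__init__()
        self.n_agents = n_agents
        # Hypernetworks produce the mixing weights from the global state.
        self.w1 = nn.Linear(state_dim, n_agents * embed)
        self.b1 = nn.Linear(state_dim, embed)
        self.w2 = nn.Linear(state_dim, embed)
        self.b2 = nn.Linear(state_dim, 1)

    def forward(self, agent_qs, state):
        # agent_qs: [batch, n_agents]; state: [batch, state_dim]
        bs = agent_qs.size(0)
        w1 = torch.abs(self.w1(state)).view(bs, self.n_agents, -1)  # non-negative weights
        b1 = self.b1(state).unsqueeze(1)
        hidden = torch.relu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)
        w2 = torch.abs(self.w2(state)).unsqueeze(-1)                # non-negative weights
        b2 = self.b2(state).unsqueeze(1)
        q_tot = torch.bmm(hidden, w2) + b2                          # [batch, 1, 1]
        return q_tot.view(bs, 1)
```

The non-negativity constraint is what makes Q_tot monotonic in each agent's utility, which is the property these methods rely on for decentralized greedy execution.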
“…For better coordination in further applications, issues such as non-stationarity [47] and scalability [4] remain to be solved. To address the non-stationarity caused by the concurrent learning of multiple policies, and the scalability issue as the number of agents increases, most recent works on MARL adopt the Centralized Training and Decentralized Execution (CTDE) [28,39] paradigm, which includes both value-based methods [57,52,64,76] and policy-gradient methods [16,38,68,75], as well as other techniques such as transformers [69]. Under the CTDE paradigm, however, the coordination ability of the learned policies can be fragile due to partial observability in the multi-agent environment, which is a common challenge in many multi-agent tasks [41].…”
Section: Introduction (mentioning)
confidence: 99%
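The partial observability noted in this statement is commonly handled by conditioning each agent's utility on its action-observation history, typically with a recurrent cell. The sketch below shows one such recurrent agent network; the names and the choice of a GRU cell are illustrative assumptions, not a prescription from the cited works.

```python
# Illustrative sketch: a recurrent per-agent network that summarizes the local
# action-observation history, a common way to cope with partial observability under CTDE.
import torch
import torch.nn as nn

class RecurrentAgent(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Linear(obs_dim + n_actions, hidden)  # local obs + last action (one-hot)
        self.rnn = nn.GRUCell(hidden, hidden)
        self.q_head = nn.Linear(hidden, n_actions)

    def init_hidden(self, batch_size: int):
        return torch.zeros(batch_size, self.rnn.hidden_size)

    def forward(self, obs, last_action_onehot, h):
        x = torch.relu(self.encoder(torch.cat([obs, last_action_onehot], dim=-1)))
        h_next = self.rnn(x, h)             # hidden state summarizes the history so far
        return self.q_head(h_next), h_next  # per-agent utilities and updated hidden state
```

Each agent keeps its own hidden state across a trajectory, so execution remains fully decentralized while training can still use centralized information, as in the CTDE methods discussed above.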