2022
DOI: 10.48550/arxiv.2210.06274
Preprint

Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning

Abstract: We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully perform cooperative tasks with any communication level at execution time by taking advantage of information sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully decentralized) to a setting featuring full communication (fully centralized). To formalize our setting, we define a new cl…
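The abstract's core idea, a single policy that works at any communication level by padding out whatever messages are unavailable, can be illustrated with a minimal sketch. The function name, the zero-padding scheme, and the per-message drop probability are all illustrative assumptions, not details taken from the paper.

```python
import random

def build_agent_input(own_obs, messages, comm_level):
    """Form one agent's policy input under hybrid execution.

    Each other agent's message is kept independently with probability
    comm_level and otherwise replaced by zeros, so the input shape is
    fixed regardless of how much communication is actually available.
    comm_level = 0.0 -> fully decentralized; 1.0 -> fully centralized.
    """
    kept = [m if random.random() < comm_level else [0.0] * len(m)
            for m in messages]
    # Concatenate the agent's own observation with the (possibly
    # masked) messages from all other agents.
    return own_obs + [x for m in kept for x in m]
```

Because the input layout never changes, the same trained policy can be executed anywhere between the two extremes simply by varying `comm_level`.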

Cited by 1 publication (1 citation statement)
References: 12 publications
“…The idea is to learn a policy at simulation time when there is a collective view of the system, and then at runtime use that policy but only with local observations. The typical approach in such cases is based on actor-critic systems [32], [33], [34], [35], where the actor is the distributed policy (with only local information) and the critic is a neural network that takes the overall system state. Mean-field RL [17] is one of such concrete applications of CTDE where the interactions among the population of agents are estimated by considering either the effect of a single agent and the average impact of the entire population or the influence of neighbouring agents.…”
Section: Many-agent Reinforcement Learning
confidence: 99%
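The centralized-training, decentralized-execution (CTDE) pattern described in the statement above can be sketched as follows. The class names and the placeholder scoring/value functions are illustrative stand-ins for the neural networks an actual actor-critic system would use.

```python
class Actor:
    """Decentralized policy: sees only its own local observation."""
    def __init__(self, n_actions):
        self.n_actions = n_actions

    def act(self, local_obs):
        # Placeholder decision rule standing in for a policy network:
        # maps the local observation to one of n_actions.
        return int(sum(local_obs)) % self.n_actions


class CentralizedCritic:
    """Training-time value estimate computed from the joint state."""
    def value(self, joint_obs):
        # Placeholder standing in for a critic network: averages all
        # agents' observations into a single scalar baseline.
        flat = [x for obs in joint_obs for x in obs]
        return sum(flat) / len(flat)


# At training time the critic sees every agent's observation; at
# execution time each actor relies only on its own local view.
actors = [Actor(n_actions=4) for _ in range(3)]
joint = [[0.5, 1.5], [2.0, 0.0], [1.0, 1.0]]
actions = [a.act(obs) for a, obs in zip(actors, joint)]
baseline = CentralizedCritic().value(joint)
```

The split mirrors the quoted description: the actor is the distributed policy with only local information, while the critic consumes the overall system state and is discarded at runtime.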