2022
DOI: 10.1109/lra.2021.3135930
Learning Cooperative Multi-Agent Policies With Partial Reward Decoupling

Cited by 4 publications (1 citation statement) · References 7 publications
“…Value factorization approaches include de Witt et al., who introduced COMIX, which employs a decentralizable joint action-value function with a per-agent factorization and showed significant improvement over MADDPG and VDN [97]. Freed et al. proposed relying on learned attention mechanisms to identify and decouple subsets of robots, which reduced gradient estimator variance across a variety of motion/path planning tasks and showed improvements over independent learning and COMA [98].…”

Section: Motion and Path Planning
confidence: 99%
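The per-agent factorization idea cited above can be sketched as follows. This is an illustrative, VDN-style additive decomposition written for this summary, not the COMIX or partial-reward-decoupling implementations themselves; all function names are hypothetical. The key property is that when the joint action-value is a monotonic (here, additive) combination of per-agent utilities, each agent can act greedily on its own utility and still maximize the joint value, which is what makes the policy decentralizable.

```python
from typing import Callable, Sequence


def factored_joint_q(
    per_agent_q: Sequence[Callable[[int], float]],  # per-agent utility Q_i(a_i)
    joint_action: Sequence[int],
) -> float:
    """Additive factorization: Q_tot(a_1, ..., a_n) = sum_i Q_i(a_i)."""
    return sum(q(a) for q, a in zip(per_agent_q, joint_action))


def decentralized_argmax(
    per_agent_q: Sequence[Callable[[int], float]],
    num_actions: int,
) -> list[int]:
    """Each agent maximizes its own utility independently; under an
    additive (monotonic) factorization this also maximizes Q_tot."""
    return [max(range(num_actions), key=q) for q in per_agent_q]


# Toy example with two agents and two actions each.
q_tables = [[1.0, 3.0], [2.0, 0.5]]
per_agent = [lambda a, t=t: t[a] for t in q_tables]

best = decentralized_argmax(per_agent, num_actions=2)  # [1, 0]
total = factored_joint_q(per_agent, best)              # 3.0 + 2.0 = 5.0
```

In practice the per-agent utilities are neural networks conditioned on local observations, and richer mixers (e.g. monotonic mixing networks) replace the plain sum; the attention-based decoupling of Freed et al. addresses a different axis, reducing variance by restricting each agent's learning signal to the subset of agents relevant to it.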