2023
DOI: 10.48550/arxiv.2301.02083
Preprint

Self-Motivated Multi-Agent Exploration

Abstract: In cooperative multi-agent reinforcement learning (CMARL), it is critical for agents to achieve a balance between self-exploration and team collaboration. However, agents can hardly accomplish the team task without coordination and they would be trapped in a local optimum where easy cooperation is accessed without enough individual exploration. Recent works mainly concentrate on agents' coordinated exploration, which brings about the exponentially grown exploration of the state space. To address this issue, we…

Cited by 1 publication (1 citation statement)
References 6 publications
“…There exist two major problems: how to design an intrinsic reward to guide agents' unified action tendency and how to integrate the intrinsic rewards into the CTDE framework? In MARL, there are plenty of works designing intrinsic rewards including curiosity-based incentives (Böhmer, Rashid, and Whiteson 2019; Hernandez-Leal, Kartal, and Taylor 2019; Iqbal and Sha 2019; Zhang et al 2023), the mutual influence among agents (Chitnis et al 2020; Jaques et al 2019; Wang et al 2019) and other specific designs (Strouse et al 2018; Ma et al 2022; Mguni et al 2021; Du et al 2019). However, most of them are designed to enhance exploration and employed in independent training ways, which suffer from unstable dynamics of environments.…”
Section: Introduction (citation type: mentioning)
confidence: 99%