2021
DOI: 10.48550/arxiv.2112.02618
Preprint

LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning

Abstract: Efficient exploration is important for reinforcement learning (RL) agents to achieve high rewards. In multi-agent systems, coordinated exploration and behaviour are critical for agents to jointly achieve optimal outcomes. In this paper, we introduce a new general framework for improving the coordination and performance of multi-agent reinforcement learners (MARL). Our framework, named the Learnable Intrinsic-Reward Generation Selection algorithm (LIGS), introduces an adaptive learner, the Generator, which observes the agents and lea…
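
As a rough illustration of the generation-and-selection idea named in the abstract, the sketch below shows a minimal adaptive learner that adds an intrinsic bonus to the extrinsic reward only at sufficiently novel states. This is a hypothetical sketch, not the paper's actual algorithm: the class name `Generator`, the novelty measure (distance from a running mean of visited states), and all parameters are illustrative assumptions.

```python
import numpy as np

class Generator:
    """Illustrative adaptive learner producing intrinsic rewards online.

    Hypothetical sketch of the idea named in the abstract, not LIGS
    itself: novelty is measured as distance from a running mean of
    visited states, and a bonus is added (selection) only when that
    novelty is high enough (generation).
    """

    def __init__(self, dim, lr=0.05, beta=0.1, threshold=0.5):
        self.mean = np.zeros(dim)   # running estimate of visited states
        self.lr = lr                # rate at which the estimate adapts
        self.beta = beta            # intrinsic reward scale
        self.threshold = threshold  # selection cutoff on novelty

    def step(self, state, extrinsic_reward):
        # Generation: novelty = how far this state sits from what the
        # Generator has observed so far.
        novelty = float(np.linalg.norm(state - self.mean))
        # Selection: only shape the reward at sufficiently novel states.
        intrinsic = self.beta * novelty if novelty > self.threshold else 0.0
        # Online update so frequently visited regions lose their bonus.
        self.mean += self.lr * (state - self.mean)
        return extrinsic_reward + intrinsic
```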

Cited by 1 publication (3 citation statements) | References 20 publications
“…To be better suited to MARL, some MARL-specific intrinsic reward functions have been proposed, including considering the mutual influence among agents (Chitnis et al. 2020; Jaques et al. 2019; Wang et al. 2019), encouraging agents to reveal or hide their intentions (Strouse et al. 2018), and predicting observations in alignment with their neighbors (Ma et al. 2022). Besides, intrinsic rewards without task-oriented bias can increase the diversity of the intrinsic reward space, which can be implemented by breaking down the extrinsic rewards via credit assignment (Du et al. 2019) or using adaptive learners to obtain intrinsic rewards online (Mguni et al. 2021). Apart from such independent ways of handling rewards, EMC (Zheng et al. 2021) proposed a curiosity-driven intrinsic reward and introduced an integrated way to accomplish the CTDE training paradigm.…”
Section: Related Work
confidence: 99%
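
As context for the curiosity-driven intrinsic reward attributed to EMC above, the sketch below shows the generic mechanism: a learned forward model whose prediction error serves as an exploration bonus. The network shape and the `curiosity` helper are illustrative assumptions; this is a generic sketch of the technique, not EMC's implementation.

```python
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Predicts the next observation from the current one plus an action.

    Generic curiosity sketch: prediction error on transitions serves as
    an intrinsic reward. Illustrative only, not EMC's architecture.
    """

    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, obs_dim),
        )

    def curiosity(self, obs, act, next_obs):
        # High prediction error => poorly modelled transition =>
        # larger exploration bonus.
        pred = self.net(torch.cat([obs, act], dim=-1))
        return ((pred - next_obs) ** 2).mean(dim=-1)
```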
“…Given the CTDE paradigm, many deep MARL methods have been proposed, including VDN (Sunehag et al. 2017), QMIX (Rashid et al. 2020), QTRAN (Son et al. 2019), QPLEX (Wang et al. 2020b), and so forth. Their excellent performance can be attributed to credit assignment, as rewards are the most direct and fundamental instructional signals for driving behavior (Silver et al. 2021; Zheng et al. 2021; Mguni et al. 2021).…”
Section: Introduction
confidence: 99%
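
For readers unfamiliar with the value-decomposition methods listed above, the sketch below illustrates VDN's additive credit assignment, the simplest of that family: per-agent utilities are summed into a joint Q-value, so a single TD update on the joint value flows back to every agent. This is a minimal sketch of the published VDN idea; the class and function names are illustrative.

```python
import torch
import torch.nn as nn

class AgentQNet(nn.Module):
    """Per-agent utility network Q_i(o_i, ·)."""

    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)

def vdn_joint_q(chosen_agent_qs):
    # VDN's additive decomposition: Q_tot = sum_i Q_i(o_i, a_i).
    # A TD error on Q_tot implicitly assigns credit back to each
    # agent's utility; QMIX generalizes the sum into a monotonic
    # mixing network.
    return torch.stack(chosen_agent_qs, dim=0).sum(dim=0)
```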