2019
DOI: 10.48550/arxiv.1905.00976
Preprint

Collaborative Evolutionary Reinforcement Learning

Cited by 9 publications (10 citation statements) | References: 0 publications
“…These works, while important, do not address the problem of exploring a large state space, or whether this exploration can be improved in multi-agent systems. A recent approach to collaborative evolutionary reinforcement learning [11] shares some similarities with our approach. As in our work, the authors devise a method for learning a population of diverse policies, training with a replay buffer shared among all learners to increase sample efficiency, and dynamically selecting the best learner; however, their work focuses on single-agent tasks and does not incorporate any notion of intrinsic rewards.…”
Section: Related Work
confidence: 99%
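The mechanism this statement attributes to CERL (a population of learners with diverse hyperparameters writing to one shared replay buffer, with extra rollouts dynamically assigned to the currently best learner) can be illustrated with a minimal sketch. All names here (Learner, rollout, shared_buffer) and the toy environment are assumptions for illustration, not code from the CERL paper or its codebase.

```python
# Minimal sketch: a population of learners with different hyperparameters
# shares one replay buffer, and the best-scoring learner is given extra rollouts.
import random
from collections import deque

class Learner:
    def __init__(self, lr, gamma):
        self.lr, self.gamma = lr, gamma        # per-learner hyperparameters give the population its diversity
        self.recent_returns = deque(maxlen=20)

    def act(self, state):
        return random.choice([0, 1])           # placeholder policy

    def update(self, batch):
        pass                                   # an off-policy gradient step would go here

    def score(self):
        return sum(self.recent_returns) / max(len(self.recent_returns), 1)

def rollout(learner, env_steps=100):
    """Collect one episode; transitions from every learner go into the shared buffer."""
    episode, ret, state = [], 0.0, 0.0
    for _ in range(env_steps):
        action = learner.act(state)
        next_state, reward = state + action, random.random()   # toy environment dynamics
        episode.append((state, action, reward, next_state))
        ret += reward
        state = next_state
    learner.recent_returns.append(ret)
    return episode

shared_buffer = deque(maxlen=100_000)          # replay buffer shared by all learners
population = [Learner(lr, gamma) for lr, gamma in
              [(1e-3, 0.99), (1e-4, 0.997), (5e-4, 0.9)]]

for generation in range(10):
    # every learner contributes experience and trains off-policy from the shared buffer
    for learner in population:
        shared_buffer.extend(rollout(learner))
        if len(shared_buffer) >= 32:
            learner.update(random.sample(list(shared_buffer), 32))
    # dynamically favour the best learner with an extra rollout allocation
    best = max(population, key=lambda l: l.score())
    shared_buffer.extend(rollout(best))
```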
“…Our training algorithm, EGRL, builds on the CERL framework [Khadka et al., 2019] to tackle variable-sized, multi-discrete action settings. Figure 2 illustrates the high-level architecture of EGRL.…”
Section: Training
confidence: 99%
“…In addition to the extremely large action space, the reward is end-to-end latency, a sparse and noisy learning signal that we demonstrate is unsuitable for purely gradient-based deep RL algorithms. Instead, we contribute Evolutionary Graph RL (EGRL), an extension of CERL [Khadka et al., 2019], a population-based method which previously performed well in sparse-reward tasks by combining fast policy gradient (PG) learning with a stable evolutionary algorithm (EA). Since the action spaces explored in this paper are several orders of magnitude larger than those explored in CERL, we also needed a mechanism to improve the sample efficiency of the slow EA component.…”
Section: Introduction
confidence: 99%
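The CERL ingredient highlighted here, combining a fast policy-gradient learner with a stable evolutionary population, follows the general pattern of periodically injecting the gradient learner's weights into the evolving population. The sketch below is a hedged illustration of that control flow only; fitness, mutate, and gradient_step are placeholder functions, and nothing here is taken from the EGRL or CERL implementations.

```python
# Hybrid loop sketch: slow-but-stable evolution plus a fast gradient learner
# whose parameters are periodically injected into the population.
import random

PARAM_DIM = 8

def fitness(params):
    # placeholder for an episode return under the sparse, noisy task reward
    return -sum(p * p for p in params) + random.gauss(0, 0.1)

def mutate(params, sigma=0.1):
    return [p + random.gauss(0, sigma) for p in params]

def gradient_step(params):
    # stand-in for a policy-gradient update computed from replayed experience
    return [p * 0.99 for p in params]

population = [[random.gauss(0, 1) for _ in range(PARAM_DIM)] for _ in range(10)]
pg_learner = [random.gauss(0, 1) for _ in range(PARAM_DIM)]

for generation in range(50):
    pg_learner = gradient_step(pg_learner)     # fast gradient-based learning

    # slow but stable evolution: evaluate, keep elites, refill by mutation
    ranked = sorted(population, key=fitness, reverse=True)
    elites = ranked[:3]
    population = elites + [mutate(random.choice(elites)) for _ in range(len(ranked) - 3)]

    # periodically inject the gradient learner, replacing the weakest member
    if generation % 5 == 0:
        population[-1] = list(pg_learner)

best_params = max(population, key=fitness)
```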
“…To address this redundancy issue, we apply the collaborative evolutionary reinforcement learning (CERL) approach (Khadka et al. 2019). The key idea is to use different hyperparameter settings for each opponent policy, while using an off-policy learning algorithm and a shared experience replay buffer to retain the advantage of concurrently training multiple policies.…”
Section: MARL with Ensemble Training
confidence: 99%
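As a rough illustration of the ensemble-training setup this statement describes, the configuration sketch below gives each opponent policy its own hyperparameters while handing every learner the same shared replay buffer. The names (shared_replay, opponent_configs, make_opponent) and the hyperparameter values are hypothetical, chosen only to show the structure.

```python
# Ensemble wiring sketch: per-opponent hyperparameters, one shared replay buffer.
from collections import deque

shared_replay = deque(maxlen=500_000)            # single buffer shared by all opponent learners

opponent_configs = [
    {"lr": 1e-3, "gamma": 0.99,  "tau": 0.01},   # differing hyperparameters keep the
    {"lr": 3e-4, "gamma": 0.995, "tau": 0.005},  # opponent policies from collapsing
    {"lr": 1e-4, "gamma": 0.9,   "tau": 0.02},   # onto a single behaviour
]

def make_opponent(cfg, buffer):
    # in a full system this would construct an off-policy learner that samples
    # minibatches from `buffer`; here it only records the wiring
    return {"config": cfg, "buffer": buffer}

opponents = [make_opponent(cfg, shared_replay) for cfg in opponent_configs]
```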