Proceedings of the Genetic and Evolutionary Computation Conference 2022
DOI: 10.1145/3512290.3528845

Diversity policy gradient for sample efficient quality-diversity optimization

Cited by 29 publications (13 citation statements). References 11 publications.

“…In contrast, there are fewer algorithms oriented towards structured action spaces in general, and in the tasks solved by these algorithms there are no explicit dependencies between atomic actions. Thus, existing approaches either assume the decomposed sub-actions are independent [51,78], or adopt the inductive bias of imposing a conditional dependency structure on the decomposed sub-actions, selecting them one by one in an autoregressive fashion with recurrent neural networks and splicing the picks back into the full action [58,68]. A further line of work assumes a game relationship between the decomposed sub-actions, models each sub-action as an agent, and solves the problem with MARL methods [21,42,52,83,93].…”
Section: A2 Structured or Large-scale Actions
confidence: 99%
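The autoregressive construction this excerpt describes can be made concrete with a short sketch. This is not code from the cited papers: the network shape, the GRU cell, and the discrete sub-action spaces are illustrative assumptions. The point is only how each sub-action pick conditions the next before the picks are spliced into the full action.

```python
# Minimal sketch (illustrative, not from the cited papers) of autoregressive
# sub-action selection: an RNN cell picks sub-actions one at a time, each
# conditioned on the picks so far, then the picks form the full action.
import torch
import torch.nn as nn

class AutoregressivePolicy(nn.Module):
    def __init__(self, state_dim, n_sub_actions, sub_action_dim, hidden=64):
        super().__init__()
        self.n_sub_actions = n_sub_actions
        self.sub_action_dim = sub_action_dim
        self.encoder = nn.Linear(state_dim, hidden)   # state -> initial hidden
        self.cell = nn.GRUCell(sub_action_dim, hidden)  # consumes previous pick
        self.head = nn.Linear(hidden, sub_action_dim)   # logits per sub-action

    def forward(self, state):
        h = torch.tanh(self.encoder(state))            # condition on the state
        prev = torch.zeros(state.shape[0], self.sub_action_dim)
        picks = []
        for _ in range(self.n_sub_actions):
            h = self.cell(prev, h)                     # update on previous pick
            idx = torch.distributions.Categorical(logits=self.head(h)).sample()
            prev = nn.functional.one_hot(idx, self.sub_action_dim).float()
            picks.append(idx)
        return torch.stack(picks, dim=1)               # spliced full action

# Usage: a batch of 2 states yields a (2, 4) tensor of sub-action indices.
policy = AutoregressivePolicy(state_dim=8, n_sub_actions=4, sub_action_dim=3)
action = policy(torch.randn(2, 8))
```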
“…Policy Gradient Assisted MAP-Elites (PGA-ME) [13]. Since MAP-Elites performs poorly when directly optimizing modern RL controllers [11]-[13], several algorithms [11]-[13], [39] have scaled it to such controllers. One such algorithm, PGA-ME, replaces the Gaussian noise mutation with two types of operations: (1) gradient ascent, performed with the TD3 algorithm [40], and (2) crossover, performed with a genetic algorithm [1].…”
Section: Background A. MAP-Elites
confidence: 99%
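A rough, self-contained sketch of the variation scheme this excerpt describes may help. This is not the authors' implementation: the RL return is replaced by a toy analytic objective, so the policy-gradient step is a plain gradient step standing in for TD3; the genetic operator is the directional (iso+line) variation common in MAP-Elites variants; and the archive is reduced to a dict keyed by a coarsely discretized measure. All names and constants are hypothetical.

```python
# Toy sketch of PGA-ME-style variation: half the offspring come from a
# gradient step (standing in for TD3 policy gradient), half from directional
# crossover between two archive elites. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
DIM = 10

def fitness(x):                       # toy objective (stand-in for RL return)
    return -np.sum(x ** 2)

def measure(x):                       # toy behavior descriptor -> archive cell
    return (round(float(x[0]), 1), round(float(x[1]), 1))

def gradient_step(x, lr=0.05):        # stand-in for a TD3 policy-gradient update
    return x - lr * 2 * x             # analytic gradient of the toy fitness

def directional_variation(p1, p2, sigma1=0.01, sigma2=0.1):
    # iso+line crossover: Gaussian noise plus a step along the parent line
    return p1 + sigma1 * rng.normal(size=DIM) + sigma2 * rng.normal() * (p2 - p1)

archive = {}                          # cell -> (fitness, solution)

def insert(x):
    cell, f = measure(x), fitness(x)
    if cell not in archive or f > archive[cell][0]:
        archive[cell] = (f, x)        # keep the best solution per cell

for x in rng.normal(size=(32, DIM)):  # random initial population
    insert(x)

for _ in range(200):                  # each batch: PG offspring + GA offspring
    elites = [v[1] for v in archive.values()]
    for _ in range(8):
        insert(gradient_step(elites[rng.integers(len(elites))]))
        i, j = rng.choice(len(elites), 2)
        insert(directional_variation(elites[i], elites[j]))

print(f"{len(archive)} cells filled; best fitness "
      f"{max(v[0] for v in archive.values()):.3f}")
```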
“…We consider algorithms in the MAP-Elites family which address the QD-RL problem formulated in Sec. II without making any assumptions on the measures, as opposed to algorithms based on novelty search [54] or algorithms which define measures similarly to an RL reward function [39].…”
confidence: 99%
“…QD in particular has been applied to a wide range of domains, from robotics [7], [8] to content generation [5], [9] and design [4]. Recently, many QD works have shifted focus toward Neuroevolution and the evolution of large closed-loop controllers for robotics [10]-[13].…”
Section: Introduction
confidence: 99%