2021
DOI: 10.48550/arxiv.2105.14639
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

Shaped Policy Search for Evolutionary Strategies using Waypoints

Abstract: In this paper, we try to improve exploration in Blackbox methods, particularly Evolution strategies (ES), when applied to Reinforcement Learning (RL) problems where intermediate waypoints/subgoals are available. Since Evolutionary strategies are highly parallelizable, instead of extracting just a scalar cumulative reward, we use the state-action pairs from the trajectories obtained during rollouts/evaluations, to learn the dynamics of the agent. The learnt dynamics are then used in the optimization procedure t… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Publication Types

Select...

Relationship

0
0

Authors

Journals

citations
Cited by 0 publications
references
References 11 publications
0
0
0
Order By: Relevance

No citations

Set email alert for when this publication receives citations?