2021
DOI: 10.48550/arxiv.2106.10075
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

Learning to Plan via a Multi-Step Policy Regression Method

Abstract: We propose a new approach to increase inference performance in environments that require a specific sequence of actions in order to be solved. This is for example the case for maze environments where ideally an optimal path is determined. Instead of learning a policy for a single step, we want to learn a policy that can predict n actions in advance. Our proposed method called policy horizon regression (PHR) uses knowledge of the environment sampled by A2C to learn an n dimensional policy vector in a policy dis… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Publication Types

Select...

Relationship

0
0

Authors

Journals

citations
Cited by 0 publications
references
References 10 publications
0
0
0
Order By: Relevance

No citations

Set email alert for when this publication receives citations?