Robotics: Science and Systems XVI 2020
DOI: 10.15607/rss.2020.xvi.054

Compositional Transfer in Hierarchical Reinforcement Learning

Abstract: The successful application of general reinforcement learning algorithms to real-world robotics applications is often limited by their high data requirements. We introduce Regularized Hierarchical Policy Optimization (RHPO) to improve data-efficiency for domains with multiple dominant tasks and ultimately reduce required platform time. To this end, we employ compositional inductive biases on multiple levels and corresponding mechanisms for sharing off-policy transition data across low-level controllers and tas…

Cited by 22 publications (37 citation statements)
References 23 publications
“…This promotes reduced information conditioning for π^L than π_0. Thus, in-line with previous works [14,9], we share the low-level policy π^L between policy and prior, and condition only on state s_t, ensuring minimal covariate shift and the discovery of instantaneous behaviours that generalise favourably. Unlike prior works, we consider conditioning the high-level prior, π^H_0, on additional information enabling richer skill transfer.…”
Section: Information Asymmetry (mentioning)
confidence: 70%
“…By presenting multiple priors, we enable a comparison with existing literature [14,19,20,21]. With the right masking, one can recover previously investigated asymmetries [14,19], explore additional ones, and also express purely hierarchical [9] and KL-regularized equivalents [8].…”
Section: Information Asymmetry (mentioning)
confidence: 99%