2019
DOI: 10.48550/arxiv.1905.01240
Preprint

Information asymmetry in KL-regularized RL

Abstract: Many real-world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start from the KL-regularized expected reward objective, which introduces an additional component, a default policy. Instead of relying on a fixed default policy, we learn it from data. But crucially, we restrict the amount of information the default policy receives, forcing it to le…

Cited by 11 publications (30 citation statements)
References 20 publications
“…IA can be understood as the masking of information accessible by certain modules. Not conditioning on specific environment aspects forces independence and generalisation across them [8]. In the context of hierarchical KL-regularized RL, the explored asymmetries between the high-level policy, π^H, and prior, π_0^H, have been narrow [14,19].…”
Section: Information Asymmetry (mentioning)
confidence: 99%
“…By presenting multiple priors, we enable a comparison with existing literature [14,19,20,21]. With the right masking, one can recover previously investigated asymmetries [14,19], explore additional ones, and also express purely hierarchical [9] and KL-regularized equivalents [8].…”
Section: Information Asymmetry (mentioning)
confidence: 99%