2021
DOI: 10.48550/arxiv.2105.14750
Preprint

Active Hierarchical Exploration with Stable Subgoal Representation Learning

Abstract: Goal-conditioned hierarchical reinforcement learning (HRL) serves as a successful approach to solving complex and temporally extended tasks. Recently, its success has been extended to more general settings by concurrently learning hierarchical policies and subgoal representations. However, online subgoal representation learning exacerbates the non-stationary issue of HRL and introduces challenges for exploration in high-level policy learning. In this paper, we propose a state-specific regularization that stabi…

Cited by 3 publications (14 citation statements)
References 25 publications

“…Figure 4b)), and whether they need a hand-made goal space (x,y) or an implicit curriculum of objectives. We can make two major observations: 1-methods that do not propose diverse goal-states require an implicit curriculum to learn the Ant-Maze task [Li et al 2021b;Nachum et al 2018] (Curriculum column); 2-contrastive representations seem crucial to avoid using a hand-defined goal space like the (x,y) coordinated (Goal space column) [Li et al 2021a;Nachum et al 2019a]. For methods in the "fixing the goal distribution", we did not find a representative and widely used evaluation protocol/environment among works.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…HESS [Li et al 2021a] partitions the embedding space of LESSON and rewards with a variant of a count-based bonus (see Section 5). It improves exploration in a two-dimensional latent embedding but the size of partitions may not scale well if the agent considers more latent dimensions.…”
Section: Proposing Diverse State-goals (citation type: mentioning)
confidence: 99%
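
To make the scaling concern in the statement above concrete, here is a minimal sketch of a count-based bonus over a partitioned latent space. It is not HESS's actual bonus (the statement only says "a variant"); it assumes a uniform grid partition and the classic 1/sqrt(N) visitation bonus. The names `GridCountBonus`, `num_cells`, `cell_size`, and `scale` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

import numpy as np


class GridCountBonus:
    """Illustrative count-based bonus over a uniform grid in latent space.

    Assumption: a plain 1/sqrt(N) bonus, not the specific variant used by HESS.
    """

    def __init__(self, cell_size=0.5, scale=1.0):
        self.cell_size = cell_size        # edge length of each grid cell
        self.scale = scale                # multiplier applied to the bonus
        self.counts = defaultdict(int)    # visit count per grid cell

    def _cell(self, z):
        # Map a latent vector to the integer index of the cell containing it.
        idx = np.floor(np.asarray(z, dtype=float) / self.cell_size)
        return tuple(idx.astype(int).tolist())

    def update(self, z):
        # Record a visit and return a bonus that decays with the visit count.
        cell = self._cell(z)
        self.counts[cell] += 1
        return self.scale / math.sqrt(self.counts[cell])


def num_cells(extent, cell_size, dims):
    # Number of cells needed to cover a cube of side `extent` in `dims` dimensions.
    return math.ceil(extent / cell_size) ** dims


bonus = GridCountBonus(cell_size=0.5)
first = bonus.update([0.1, 0.2])    # first visit to this cell
second = bonus.update([0.3, 0.4])   # same cell, so the bonus shrinks
other = bonus.update([0.6, 0.2])    # neighbouring cell, visited for the first time
```

Under these assumptions, covering a side length of 10 with cells of size 0.5 takes `num_cells(10.0, 0.5, 2)` = 400 cells in two dimensions but `num_cells(10.0, 0.5, 6)` = 64,000,000 in six, which is one way to read the remark that the partition size may not scale to more latent dimensions.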