2018
DOI: 10.48550/arxiv.1803.00781
Preprint

Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration

Abstract: Intrinsically motivated goal exploration algorithms enable machines to discover repertoires of policies that produce a diversity of effects in complex environments. These exploration algorithms have been shown to allow real world robots to acquire skills such as tool use in high-dimensional continuous state and action spaces. However, they have so far assumed that self-generated goals are sampled in a specifically engineered feature space, limiting their autonomy. In this work, we propose to use deep represent…
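As a concrete illustration of the idea sketched in the abstract, the following is a minimal, assumed-only sketch of an intrinsically motivated goal exploration loop in which goals are sampled from a latent space learned by an unsupervised encoder rather than from an engineered feature space. The environment interface (env.rollout, env.policy_dim) and the encode function are hypothetical placeholders, not the paper's actual API.

```python
# Hypothetical sketch of an intrinsically motivated goal exploration loop where
# goals are sampled in a latent space learned by an unsupervised encoder (e.g. a
# VAE trained on raw observations), instead of a hand-engineered feature space.
# All names (env.rollout, env.policy_dim, encode) are illustrative placeholders.

import numpy as np

def goal_exploration(env, encode, latent_dim, n_iterations=1000, noise=0.05):
    """Build a repertoire of (latent outcome, policy parameters) pairs by
    sampling goals in the learned latent space and mutating the policy whose
    past outcome is closest to each goal."""
    repertoire = []

    # Bootstrap with a few random policies so the repertoire is non-empty.
    for _ in range(10):
        theta = np.random.uniform(-1.0, 1.0, size=env.policy_dim)
        obs = env.rollout(theta)                 # final raw observation (e.g. an image)
        repertoire.append((encode(obs), theta))  # store its latent embedding

    for _ in range(n_iterations):
        # 1. Self-generate a goal by sampling in the learned latent space.
        goal = np.random.uniform(-3.0, 3.0, size=latent_dim)

        # 2. Retrieve the policy whose past outcome is closest to the goal.
        _, theta = min(repertoire, key=lambda entry: np.linalg.norm(entry[0] - goal))

        # 3. Perturb its parameters and roll the new policy out.
        theta_new = theta + noise * np.random.randn(*theta.shape)
        obs = env.rollout(theta_new)

        # 4. Store the achieved outcome, encoded in the same latent space.
        repertoire.append((encode(obs), theta_new))

    return repertoire
```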

Cited by 26 publications (36 citation statements). References 21 publications.
“…Several authors have explored a pre-training stage, sometimes paired with fine-tuning, based on unsupervised representation learning. Péré et al. (2018) and Laversanne-Finot et al. (2018) employ a two-stage framework wherein unsupervised representation learning is used to learn a model of the observations from which to sample goals for control in simple simulated environments. Nair et al. (2018) propose a similar approach in the context of model-free Q-learning applied to 3-dimensional simulations and robots.…”
Section: Related Work (mentioning)
confidence: 99%
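As a rough sketch of the two-stage framework described in the statement above (under assumed interfaces, not the cited authors' code), the first stage could fit a simple autoencoder to passively collected observations; its encoder then maps raw observations into the latent space from which goals are later sampled.

```python
# Minimal sketch (assumed, not the cited authors' code) of the unsupervised
# first stage: fit an autoencoder to passively collected raw observations so
# that its latent space can later serve as the goal space for exploration.

import torch
import torch.nn as nn

class ObservationAutoencoder(nn.Module):
    def __init__(self, obs_dim: int, latent_dim: int = 10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, obs_dim),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def fit_goal_space(observations: torch.Tensor, epochs: int = 50):
    """Train on a tensor of flattened observations; returns the encoder,
    which maps raw observations to latent goals."""
    model = ObservationAutoencoder(obs_dim=observations.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        recon, _ = model(observations)
        loss = nn.functional.mse_loss(recon, observations)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model.encoder
```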
“…Nevertheless, those methods have not considered the stability of the subgoal representation learning, which results in a non-stationary high-level learning environment. Other prior methods utilize a predefined or pretrained subgoal space [4,5,24,25] to maintain the stability of the subgoal representation. However, those methods require task-specific human knowledge or extra training data.…”
Section: Related Work (mentioning)
confidence: 99%
“…Goal-conditioned hierarchical reinforcement learning (HRL) has long demonstrated great potential to solve temporally extended tasks with sparse and delayed rewards [1][2][3][4][5], where higher-level policies periodically communicate subgoals to lower-level ones, and lower-level policies are intrinsically rewarded for reaching those subgoals. Early goal-conditioned HRL studies used a hand-designed subgoal space, such as positions of robots [6,7] or objects in images [8].…”
Section: Introduction (mentioning)
confidence: 99%
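To make the intrinsic reward mentioned in the statement above concrete, here is a small illustrative sketch (not taken from the cited papers) in which the subgoal space is hand-designed as the robot's x-y position and the lower-level policy is rewarded with the negative distance to the subgoal.

```python
# Illustrative sketch of the intrinsic reward in goal-conditioned HRL: the
# lower-level policy is rewarded for moving the state, projected into a
# hand-designed subgoal space (here, x-y position), towards the subgoal
# emitted by the higher level. All names are hypothetical.

import numpy as np

def subgoal_features(state: np.ndarray) -> np.ndarray:
    """Hand-designed subgoal space: keep only the robot's x-y position,
    assumed to occupy the first two state dimensions."""
    return state[:2]

def intrinsic_reward(state: np.ndarray, subgoal: np.ndarray) -> float:
    """Negative Euclidean distance between the achieved features and the subgoal."""
    return -float(np.linalg.norm(subgoal_features(state) - subgoal))

# Example: higher level asks for position (1.0, 2.0); agent is at (0.5, 2.0, ...).
state = np.array([0.5, 2.0, 0.0, 0.0])
print(intrinsic_reward(state, np.array([1.0, 2.0])))  # -0.5
```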
“…We consider another way of comparing the FD spaces, by using the KL-Coverage metric from [39], which was also used in the original AURORA paper [11], to check whether two FD spaces are similar to each other (possibly hinting at their representation capabilities). We extend the original definition to handle multi-container scenarios:…”
Section: A Pairwise KL-Coverage (mentioning)
confidence: 99%
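For intuition, a KL-based comparison of two feature-descriptor (FD) spaces could be computed as below. This is only an assumed formulation of such a metric, not necessarily the exact KL-Coverage definition from [39] or its multi-container extension: descriptors from two containers are binned on a shared grid and the KL divergence between the resulting occupancy histograms is taken.

```python
# Hedged sketch of a KL-based coverage comparison between two sets of feature
# descriptors. Assumed formulation, not necessarily the KL-Coverage of [39].

import numpy as np

def kl_coverage(fd_a: np.ndarray, fd_b: np.ndarray, bins: int = 20, eps: float = 1e-8) -> float:
    """fd_a, fd_b: arrays of shape (n_solutions, fd_dim) holding the descriptors
    stored in two containers. Returns KL(P_a || P_b) over a shared histogram grid."""
    lo = np.minimum(fd_a.min(axis=0), fd_b.min(axis=0))
    hi = np.maximum(fd_a.max(axis=0), fd_b.max(axis=0))
    edges = [np.linspace(l, h, bins + 1) for l, h in zip(lo, hi)]

    hist_a, _ = np.histogramdd(fd_a, bins=edges)
    hist_b, _ = np.histogramdd(fd_b, bins=edges)
    p_a = hist_a.ravel() + eps
    p_b = hist_b.ravel() + eps
    p_a /= p_a.sum()
    p_b /= p_b.sum()

    return float(np.sum(p_a * np.log(p_a / p_b)))

# Example: two containers with similar coverage give a value close to zero.
# kl_coverage(np.random.rand(500, 2), np.random.rand(500, 2))
```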