2017
DOI: 10.48550/arxiv.1702.02453
Preprint

Preparing for the Unknown: Learning a Universal Policy with Online System Identification

Cited by 52 publications (75 citation statements)
References 9 publications
“…The idea of MEP shares a common intuition with domain randomization, where some features of the environment are changed randomly during training to make the policy robust to that feature [45,50,34,42,1,43]. MEP can be seen as a domain randomization technique, where the randomization is conducted over a set of partners' policies.…”
Section: Related Work (mentioning)
confidence: 99%
“…This type of problem can also be solved by learning a hierarchical policy (Kupcsik et al, 2013), where the upper layer depends on the context and the lower one on the current state of the robot. Similarly (Yu et al, 2017) learns a unique policy in a several environment and use an identification method that feeds a context signal to the policy. The drawback of these methods is that they require access to a wide range of environment in order to learn a latent representation of the dynamics.…”
Section: Related Work (mentioning)
confidence: 99%
“…Robot learning [1]-[3] has been testified to successfully work not only in simulation but also on real-world robotic control. In this domain, reinforcement learning (RL) [4]-[6] methods are typically applied for robotic control via sim-to-real transfer [2], [7], [8].…”
Section: Introduction (mentioning)
confidence: 99%
“…SI methods usually configure the physical parameters from historical transition data, either in an explicit or implicit manner. Both approaches have been proven to be feasible for some control tasks [1], [3], when encountering the sim-to-real transfer problems, or more generally, domain transfer problems. However, a successful execution of the task does not necessarily indicate that an optimal control strategy is achieved, which leaves the space for further improvement based on solving the existing defects of above approaches.…”
Section: Introduction (mentioning)
confidence: 99%