2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros.2015.7354297
Learning compound multi-step controllers under unknown dynamics

Cited by 33 publications (45 citation statements)
References 13 publications
“…Baselines. We evaluate forward-backward RL (FBRL) (Han et al., 2015; Eysenbach et al., 2017), a perturbation controller (R3L) (Zhu et al., 2020), value-accelerated persistent RL (VaPRL) (Sharma et al., 2021), a comparison to simply running the base RL algorithm with the biased TD update discussed in Section 6.1 (naïve RL), and finally an oracle (oracle RL) where resets are provided every H_E steps (H_T is typically three orders of magnitude larger than H_E). We benchmark VaPRL only when demonstrations are available, in accordance with the algorithm proposed in Sharma et al. (2021).…”
Section: Evaluation: Setup, Metrics, Baselines, and Results
confidence: 99%
“…Reset-free RL has been studied by previous works with a focus on safety (Eysenbach et al., 2017), automated and unattended learning in the real world (Han et al., 2015; Zhu et al., 2020), skill discovery (Lu et al., 2020), and providing a curriculum (Sharma et al., 2021). Strategies to learn reset-free behavior include directly learning a backward reset controller (Eysenbach et al., 2017), learning a set of auxiliary tasks that can serve as an approximate reset (Ha et al., 2020), or using a novelty-seeking reset controller (Zhu et al., 2020).…”
Section: Related Work
confidence: 99%
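The forward-backward scheme mentioned in this excerpt can be sketched in a toy setting: a forward policy drives the agent toward the goal, and a backward policy returns it to the start state in place of a manual reset, so training can continue unattended. Everything below (the `ToyChain` environment, the greedy stand-in policies) is illustrative and not taken from the cited papers.

```python
class ToyChain:
    """1-D chain: states 0..n, agent starts at 0, goal is state n."""
    def __init__(self, n=5):
        self.n = n
        self.state = 0

    def step(self, action):  # action in {-1, +1}
        self.state = min(max(self.state + action, 0), self.n)
        return self.state


def greedy_policy(env, target):
    """Stand-in 'policy' that just moves one step toward a target state."""
    return 1 if env.state < target else -1


def forward_backward_episode(env, horizon=20):
    """One forward phase (reach the goal), then one backward phase
    (return to the start), standing in for an external reset."""
    for _ in range(horizon):          # forward phase
        if env.state == env.n:
            break
        env.step(greedy_policy(env, env.n))
    reached_goal = env.state == env.n
    for _ in range(horizon):          # backward phase: learned "reset"
        if env.state == 0:
            break
        env.step(greedy_policy(env, 0))
    return reached_goal, env.state


env = ToyChain(n=5)
for episode in range(3):              # no external resets between episodes
    reached, end_state = forward_backward_episode(env)
    print(reached, end_state)         # each episode ends back at the start
```

In the actual methods both phases are learned controllers trained with RL; here greedy policies stand in so the control flow (alternating forward attempts with backward resets) is visible in a few lines.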
“…Reinforcement Learning (RL) is increasingly popular in robotics as it facilitates learning control policies through exploration [17,24,36,11,35,40,9,37,23,10]. However, it is well known that the efficacy of RL algorithms depends heavily on how reward functions are specified [32].…”
Section: Introduction
confidence: 99%