2018 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2018.8460756
Composable Deep Reinforcement Learning for Robotic Manipulation

Abstract: Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the interaction time with the environment is limited, as is the case for most real-world robotic tasks. In this paper, we study how maximum entropy policies trained using soft Q-learning can be applied to real-world robotic manipulation. The application of this method to real-world man…

Cited by 778 publications (1,252 citation statements)
References 28 publications
“…which incentivizes the policy to explore more widely, improving its robustness against perturbations [16]. The temperature parameter α determines the relative importance of the entropy term against the reward, and thus controls the stochasticity of the optimal policy.…”
Section: Reinforcement Learning Preliminaries
confidence: 99%
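For context, the entropy-regularized objective this statement refers to is conventionally written as follows (standard maximum entropy RL notation; the symbols below are not quoted from the report):

```latex
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
\left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
```

Setting α → 0 recovers the conventional expected-return objective, while larger α rewards more stochastic, exploratory policies.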
See 1 more Smart Citation
“…which incentivizes the policy to explore more widely improving its robustness against perturbations [16]. The temperature parameter α determines the relative importance of the entropy term against the reward, and thus controls the stochasticity of the optimal policy.…”
Section: Reinforcement Learning Preliminariesmentioning
confidence: 99%
“…Learning robotic tasks in the real world requires an algorithm that is sample efficient, robust, and insensitive to the choice of hyperparameters. Maximum entropy RL is both sample efficient and robust, making it a good candidate for real-world robot learning [16]. However, one of the major challenges of maximum entropy RL is its sensitivity to the temperature parameter, which typically needs to be tuned for each task separately.…”
Section: Automating Entropy Adjustment for Maximum Entropy RL
confidence: 99%
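The automatic temperature adjustment this statement alludes to is typically implemented as a small gradient step on α that drives the policy's entropy toward a target value. A minimal sketch, assuming a PyTorch-style setup (names such as `log_alpha` and `target_entropy` are illustrative, not taken from the report):

```python
import torch

# Learn log(alpha) rather than alpha itself so the temperature stays positive.
log_alpha = torch.zeros(1, requires_grad=True)
alpha_optimizer = torch.optim.Adam([log_alpha], lr=3e-4)

# A common heuristic target entropy: -|A|, the negative action dimensionality.
action_dim = 6
target_entropy = -float(action_dim)

def update_temperature(log_prob: torch.Tensor) -> float:
    """One gradient step on alpha given log-probs of sampled actions.

    The loss raises alpha when the policy's entropy (-log_prob) falls
    below the target and lowers it when the entropy exceeds the target.
    """
    alpha_loss = -(log_alpha.exp() * (log_prob + target_entropy).detach()).mean()
    alpha_optimizer.zero_grad()
    alpha_loss.backward()
    alpha_optimizer.step()
    return log_alpha.exp().item()

# Usage with a dummy batch of action log-probabilities:
new_alpha = update_temperature(torch.randn(256))
```

This removes α from the set of per-task hyperparameters at the cost of choosing a target entropy, which is usually set by the simple dimensionality heuristic above.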
“…Recent work in RL for manipulation has tended to take a more tabula rasa approach, focusing on learning policies that output joint torques directly or that output position (and velocity) references to an underlying PD controller. Direct torque control has been used to learn many physical and simulated tasks, including peg insertion, placing a coat hanger, hammering, screwing a bottle cap [6], door opening, pick and place tasks [5], and Lego stacking tasks [20]. Learning position and/or velocity references to a fixed PD joint controller has been used for tasks such as door opening, hammering, object placement [21], Lego stacking [7], and in-hand manipulation [1].…”
Section: Introduction
confidence: 99%
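To make the second interface concrete: a policy that outputs position (and velocity) references delegates torque generation to a low-level PD loop. A minimal sketch of such a joint-space PD controller (the gains and names are illustrative assumptions, not taken from the cited works):

```python
import numpy as np

def pd_torque(q_ref, qd_ref, q, qd, kp=100.0, kd=10.0):
    """Joint torques tracking a position/velocity reference, per joint:
    tau = Kp * (q_ref - q) + Kd * (qd_ref - qd)."""
    q_ref, qd_ref, q, qd = map(np.asarray, (q_ref, qd_ref, q, qd))
    return kp * (q_ref - q) + kd * (qd_ref - qd)

# Example: a 6-DoF arm being driven back to a commanded posture.
tau = pd_torque(q_ref=np.zeros(6), qd_ref=np.zeros(6),
                q=np.full(6, 0.1), qd=np.zeros(6))
```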
“…It is also possible to incorporate optimization layers (e.g., a QP program [144]) into a neural network in order to take advantage of the structure they provide. Lastly, one can learn distinct soft policies for simpler tasks and then compose them in order to achieve a more complicated task [65].…”
Section: Generalization and Robustness
confidence: 99%
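The composition result referenced in [65] (the paper under review) is that maximum entropy policies trained for individual tasks can be approximately combined by averaging their soft Q-functions and acting softmax-greedily with respect to the average. A minimal sketch over a discrete action set (the function name is illustrative; the paper itself works with continuous actions via soft Q-learning):

```python
import numpy as np

def compose_soft_policy(q_values_per_task, alpha=1.0):
    """Approximate composed policy from per-task soft Q-values.

    q_values_per_task: shape (num_tasks, num_actions), holding Q_i(s, a)
    for a fixed state s. The composed policy is
    pi_C(a|s) ∝ exp(mean_i Q_i(s, a) / alpha).
    """
    q = np.asarray(q_values_per_task, dtype=float)
    logits = q.mean(axis=0) / alpha
    logits -= logits.max()  # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Example: composing two task-specific Q-functions over three actions.
pi = compose_soft_policy([[1.0, 0.5, 0.0],
                          [0.2, 0.9, 0.1]])
```

Intuitively, actions that score well under every constituent task keep high probability under the composed policy, which is what makes the averaged soft Q-function a reasonable approximation for the intersection of the tasks.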