2017 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2017.7989385
Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates

Abstract: Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically involves introducing hand-engineered policy representations and human-supplied demonstrations. Deep reinforcement learning alleviates this limitat…

Cited by 1,284 publications (854 citation statements); references 34 publications. Citing publications span 2017 to 2023.
“…The fundamental technique in our MAC protocol design is deep reinforcement learning (DRL). DRL is a machine learning technique that combines the decision-making ability of reinforcement learning (RL) [3] and the function approximation ability of deep neural networks [4] to solve complex decision-making problems, including game playing, robot control, wireless communications, and network management and control [5][6][7][8][9][10]. In RL/DRL, at each time step, the decision-making agent interacts with its external environment by executing an action.…”
Section: Introduction
confidence: 99%
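The interaction loop described above — an agent that, at each time step, executes an action in its environment, observes the resulting state and reward, and updates its value estimates — can be sketched with plain tabular Q-learning. The toy `Corridor` environment and all names below are illustrative, not taken from the cited works:

```python
import random

# Toy "corridor" environment: states 0..4, actions 0 (left) / 1 (right).
# Reaching state 4 yields reward 1.0 and ends the episode.
class Corridor:
    def __init__(self):
        self.n_states, self.n_actions = 5, 2

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state = max(0, min(4, self.state + (1 if action == 1 else -1)))
        done = self.state == 4
        return self.state, (1.0 if done else 0.0), done

def q_learning(env, episodes=200, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning: at each step the agent acts,
    observes (next_state, reward), and moves Q toward the TD target."""
    rng = random.Random(seed)
    Q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if rng.random() < eps:                      # explore
                a = rng.randrange(env.n_actions)
            else:                                       # exploit
                a = max(range(env.n_actions), key=lambda b: Q[s][b])
            s2, r, done = env.step(a)
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])       # TD update
            s = s2
    return Q

Q = q_learning(Corridor())
greedy = [max(range(2), key=lambda b: Q[s][b]) for s in range(4)]
```

After training, the greedy policy moves right in every non-terminal state, since only reaching state 4 is rewarded.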
“…To overcome this limitation, recent developments combine RL techniques with the significant feature extraction and processing capabilities of deep learning models in a framework known as Deep Q-Network (DQN) [6]. This approach exploits deep neural networks for both feature selection and Q-function approximation, hence enabling unprecedented performance in complex settings such as learning efficient playing strategies from unlabeled video frames of Atari games [7], robotic manipulation [8], and autonomous navigation of aerial [9] and ground vehicles [10].…”
Section: Introduction
confidence: 99%
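The passage above notes that DQN replaces the Q-table with a deep network used as a function approximator. As a minimal stdlib-only sketch of that idea, the snippet below substitutes a linear approximator over one-hot state features, trained by semi-gradient descent on the TD error; a real DQN additionally uses a multi-layer network, an experience replay buffer, and a target network. All names and sizes here are hypothetical:

```python
# Illustrative parameterized Q-function: Q(s, a) = w[a] . features(s).
# This is a stand-in for the neural network in DQN, not the actual method
# of the cited papers.
N_STATES, N_ACTIONS = 5, 2

def features(s):
    """One-hot encoding of a discrete state."""
    return [1.0 if i == s else 0.0 for i in range(N_STATES)]

def q_value(w, s, a):
    return sum(wi * xi for wi, xi in zip(w[a], features(s)))

def td_update(w, s, a, r, s2, done, alpha=0.1, gamma=0.9):
    """One semi-gradient TD step on the weights for action a."""
    target = r + (0.0 if done else
                  gamma * max(q_value(w, s2, b) for b in range(N_ACTIONS)))
    error = target - q_value(w, s, a)
    x = features(s)
    for i in range(N_STATES):
        w[a][i] += alpha * error * x[i]   # gradient of Q w.r.t. w[a] is x
    return error

# Single update from a terminal transition (s=3, a=1, reward 1.0):
w = [[0.0] * N_STATES for _ in range(N_ACTIONS)]
err = td_update(w, 3, 1, 1.0, 4, True)
```

With one-hot features this reduces to the tabular case; the point is that the same update rule applies unchanged once `features` and `q_value` are swapped for a learned deep representation.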
“…Recent work on deterministic policy gradients (Lillicrap et al., 2015) and on RL benchmarks, e.g., OpenAI Gym, generally uses joint torques as the action space, as do the test suites in recent work (Schulman et al., 2015) on generalized advantage estimation. Other recent work uses: the PR2 effort control interface as a proxy for torque control; joint velocities (Gu et al., 2016); velocities under an implicit control policy (Mordatch et al., 2015); or abstract actions (Hausknecht & Stone, 2015). Our learning procedures are based on prior work using actor-critic approaches with positive temporal difference updates (Van Hasselt, 2012).…”
Section: Related Work
confidence: 99%