2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017
DOI: 10.1109/iros.2017.8205960

Deep dynamic policy programming for robot control with raw images

Cited by 12 publications
(7 citation statements)
References 14 publications
“…A]. DPP has been extended to a deep learning setting [30], but it is less efficient than DQN 3 [31]. Taking the limit τ → 0, we retrieve Advantage Learning (AL) [5,7] (see Appx.…”
Section: What Happens Under the Hood?
Citation type: mentioning
confidence: 99%
“…From a practical point of view, neither SQL nor DPP have been originally implemented in RL on large scale problems. A deep version of a variation of DPP 4 have been proposed by Tsurumine et al [2017], but it is only applied on a small number of samples. The principal issue of a practical DPP is that it has to estimate ψ_k, a quantity that is asymptotically unbounded.…”
Section: Related Work and Discussion
Citation type: mentioning
confidence: 99%

Momentum in Reinforcement Learning

Vieillard, Scherrer, Pietquin et al. 2019
Preprint
“…For the initial arm position, we use two different settings. The first (P 1) one uses a single initial arm position for learning as in [7]. This position is made well oriented for a palm-touching purpose.…”
Section: Experimental Protocol
Citation type: mentioning
confidence: 99%
“…[Author footnote: 1 F. de La Bourdonnaye, C. Teulière and T. Chateau are with the University of Clermont Auvergne, the Pascal Institute CNRS, UMR6602, Aubière, France; 2 J. Triesch is with the Frankfurt Institute for Advanced Studies, Frankfurt am Main, Germany.] …in manipulation robotics [3], [4], because they help to discriminate values of close states. For instance, [3], [4], [5], [6] and [7] use a distance measure between the current pose and a target pose to design an informative reward in manipulation tasks such as block stacking, reaching and door pushing or pulling. These rewards require knowledge of robot kinematics and target pose.…”
Section: Introduction
Citation type: mentioning
confidence: 99%