2014 IEEE/RSJ International Conference on Intelligent Robots and Systems
DOI: 10.1109/IROS.2014.6942741
Simultaneous On-line Discovery and Improvement of Robotic Skill Options

Abstract: The regularity of everyday tasks enables us to reuse existing solutions for task variations. For instance, most door handles require the same basic skill (reach, grasp, turn, pull), but small adaptations of the basic skill are required to adapt to the variations that exist (e.g. levers vs. knobs). We introduce the algorithm "Simultaneous On-line Discovery and Improvement of Robotic Skills" (SODIRS), which is able to autonomously discover and optimize skill options for such task variations. We formalize …
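To make the idea of a library of skill options concrete, here is a small, hypothetical sketch of maintaining and adapting such options for task variations. The class name, spawn threshold, and the random-search refinement step are all assumptions for illustration; this is not the SODIRS algorithm from the paper.

```python
import numpy as np

class SkillOptionLibrary:
    """Toy library of skill options (policy parameter vectors).
    Hypothetical illustration of the idea in the abstract, not SODIRS itself."""

    def __init__(self, base_skill, spawn_threshold=1.0):
        self.options = [np.asarray(base_skill, dtype=float)]
        self.spawn_threshold = spawn_threshold

    def handle_task(self, cost_fn, refine_fn):
        """Pick the best existing option for this task variation; refine it,
        or spawn a new option if even the best one performs poorly."""
        costs = [cost_fn(o) for o in self.options]
        i = int(np.argmin(costs))
        if costs[i] > self.spawn_threshold:
            self.options.append(refine_fn(self.options[i].copy(), cost_fn))
            return len(self.options) - 1
        self.options[i] = refine_fn(self.options[i], cost_fn)
        return i

def refine(theta, cost_fn, n_iter=50, sigma=0.2):
    """Tiny random-search refinement standing in for a policy-improvement step."""
    best, best_cost = np.asarray(theta, dtype=float), cost_fn(theta)
    for _ in range(n_iter):
        cand = best + sigma * np.random.randn(*best.shape)
        if (c := cost_fn(cand)) < best_cost:
            best, best_cost = cand, c
    return best

# Toy usage: two "door handle" variations, each defining its own rollout cost.
lib = SkillOptionLibrary(base_skill=[0.0, 0.0], spawn_threshold=0.5)
lever_cost = lambda th: float(np.sum((np.asarray(th) - [0.1, 0.0]) ** 2))
knob_cost = lambda th: float(np.sum((np.asarray(th) - [2.0, 2.0]) ** 2))
lib.handle_task(lever_cost, refine)   # refines the base option in place
lib.handle_task(knob_cost, refine)    # spawns a second option for the knob
```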

Cited by 9 publications (9 citation statements). References 10 publications.
“…Baranes and Oudeyer (Baranes & Oudeyer, 2011) have studied the efficiency of combining stochastic optimization to reach goals with maturational mechanisms which progressively grow the limits within which stochastic optimization can physically explore, showing an increase in efficiency from a machine learning point of view. Several works have shown how human demonstration of movements could bootstrap this optimization process (e.g., Stulp, Herlant, Hoarau, & Raiola, 2014), or how humans can progressively shape subparts of the movements to complement autonomous exploration (Chernova & Thomaz, 2014). Finally, exploration in infants is also highly driven by mechanisms of intrinsic motivation (also called curiosity), where instead of trying to reach a goal imposed by social peers or the experimenter (as in the model presented in this paper), they use intrinsic criteria such as information gain or surprise to set their own goals and choose how to practice these self-selected goals (Gottlieb, Oudeyer, Lopes, & Baranes; Moulin-Frier, Nguyen, & Oudeyer).…”
Section: General Discussion and Conclusion
Citation type: mentioning (confidence: 99%)
“…Baranes and Oudeyer (Baranes & Oudeyer, 2011) have studied the efficiency of combining stochastic optimization to reach goals with maturational mechanisms which progressively grow the limits within which stochastic optimization can physically explore, showing an increase in efficiency from a machine learning point of view. Several works have shown how human demonstration of movements could bootstrap this optimization process (e.g., Stulp, Herlant, Hoarau, & Raiola, 2014), or how humans can progressively shape subparts of the movements to complement autonomous exploration (Chernova & Thomaz, 2014).…”
Section: Complementarity With Other Mechanisms
Citation type: mentioning (confidence: 99%)
“…Direct policy search is a form of reinforcement learning in which the search for the optimal policy is done directly in the space of the parameters θ of a parameterized policy π_θ, rather than using a value function. The specific algorithm we use is PI^BB (Policy Improvement through Black-Box optimization [22]). Since any model-free direct policy search algorithm could be used to implement this optimization (e.g.…”
Section: Optimization Algorithm: Direct Policy Search
Citation type: mentioning (confidence: 99%)
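To illustrate the quoted description, below is a minimal sketch of reward-weighted black-box policy improvement in the spirit of PI^BB: sample perturbed policy parameters around the current mean, evaluate each with a rollout, and update the mean as a softmax-weighted average of the samples. The function names, the temperature h, and the exploration-decay schedule are assumptions for illustration, not the exact formulation of [22].

```python
import numpy as np

def pibb_update(theta, costs, samples, h=10.0):
    """One reward-weighted averaging update (PI^BB-style sketch)."""
    J = np.asarray(costs, dtype=float)
    # Softmax over normalized costs: lower cost -> exponentially higher weight.
    J_range = max(J.max() - J.min(), 1e-10)
    w = np.exp(-h * (J - J.min()) / J_range)
    w /= w.sum()
    # New policy parameters are the weighted average of the sampled parameters.
    return w @ samples

def optimize(cost_fn, theta0, sigma=0.05, n_samples=10, n_updates=100, decay=0.98):
    """Direct policy search: perturb parameters, roll out, average by reward."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_updates):
        # Sample perturbed policy parameters from an isotropic Gaussian.
        samples = theta + sigma * np.random.randn(n_samples, theta.size)
        costs = [cost_fn(s) for s in samples]   # one rollout per sample
        theta = pibb_update(theta, costs, samples)
        sigma *= decay                          # shrink exploration over time
    return theta

# Toy usage: minimize a quadratic "rollout cost" over 5 policy parameters.
best = optimize(lambda th: float(np.sum(th ** 2)), theta0=np.ones(5))
```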
“…Despite its simplicity, PI^BB is able to learn robot skills efficiently and robustly [22]. Alternatively, algorithms such as PI^2, PoWER, NES, PGPE, or CMA-ES could be used, see [23,11] for an overview and comparisons.…”
Section: A Policy Improvement Through Black-box Optimization
Citation type: mentioning (confidence: 99%)
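As the statement notes, other black-box optimizers can be dropped into the same direct policy search loop. A minimal sketch using CMA-ES via the widely used cma Python package (an assumed dependency, unrelated to the cited papers' code) might look like:

```python
import numpy as np
import cma  # pip install cma

# Toy rollout cost over 5 policy parameters (placeholder for a real robot rollout).
cost_fn = lambda theta: float(np.sum(np.asarray(theta) ** 2))

es = cma.CMAEvolutionStrategy(x0=[1.0] * 5, sigma0=0.5)
while not es.stop():
    candidates = es.ask()                                   # sample candidate parameters
    es.tell(candidates, [cost_fn(c) for c in candidates])   # report rollout costs
best_theta = es.result.xbest
```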
“…Next to approaches that have considered finite sets of parameterized problems [4], [12], other approaches [7], [8], [9], [10], [13] have considered the challenge of autonomous exploration and learning of continuous fields of parameterized problems (e.g. discovering and learning all the feasible displacements of objects and their motor solutions).…”
Section: Introduction
Citation type: mentioning (confidence: 99%)