2018
DOI: 10.3389/frobt.2018.00049
Bootstrapping of Parameterized Skills Through Hybrid Optimization in Task and Policy Spaces

Abstract: Modern robotic applications create high demands on adaptation of actions with respect to variance in a given task. Reinforcement learning is able to optimize for these changing conditions, but relearning from scratch is hardly feasible due to the high number of required rollouts. We propose a parameterized skill that generalizes to new actions for changing task parameters, which is encoded as a meta-learner that provides parameters for task-specific dynamic motion primitives. Our work shows that utilizing para…
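The core idea in the abstract — a meta-learner that maps a task parameter to the parameters of a task-specific dynamic motion primitive (DMP) — can be sketched in a few lines. The sketch below assumes a radial-basis ridge regression as the meta-learner; the class, function names, and the linear-in-features model are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def features(tau, centers, width=0.15):
    # Radial-basis features over the (here 1-D) task-parameter space.
    return np.exp(-((tau - centers) ** 2) / (2 * width ** 2))

class ParameterizedSkill:
    """Hypothetical meta-learner: task parameter tau -> DMP shape weights."""

    def __init__(self, n_centers=5, n_weights=10):
        self.centers = np.linspace(0.0, 1.0, n_centers)
        self.W = np.zeros((n_weights, n_centers))  # meta-parameters

    def fit(self, taus, weight_sets, reg=1e-6):
        # Ridge regression from task parameters to per-task DMP weights.
        Phi = np.stack([features(t, self.centers) for t in taus])  # (N, C)
        Y = np.stack(weight_sets)                                  # (N, D)
        A = Phi.T @ Phi + reg * np.eye(len(self.centers))
        self.W = np.linalg.solve(A, Phi.T @ Y).T                   # (D, C)

    def predict(self, tau):
        # Generalize: propose DMP weights for an unseen task parameter.
        return self.W @ features(tau, self.centers)
```

After fitting on a handful of task instances whose optimal DMP weights vary smoothly with the task parameter, `predict` supplies a warm-start policy for new task parameters instead of relearning from scratch.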

Cited by 5 publications (4 citation statements) | References 38 publications
“…They tested their approach in simulation for 10-, 15-, 20-, and 25-DoF systems. In order to reduce the number of required rollouts for adaptation to new task conditions, Queißer and Steil (2018) used CMA-ES to optimize DMP parameters. In addition, they introduced a hybrid optimization method that combines a fast coarse optimization on a manifold of policy parameters with a fine-grained parameter search in the unrestricted space of actions.…”
Section: DMPs in Application Scenarios
confidence: 99%
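The hybrid optimization described in this excerpt — a fast coarse search on a low-dimensional manifold of policy parameters followed by a fine-grained search in the unrestricted space — can be sketched as two stacked searches. In this sketch a hand-rolled (1+1) evolution strategy stands in for CMA-ES (an assumption made for self-containedness; the authors used CMA-ES proper), and `skill_predict` is any hypothetical mapping from the manifold coordinate to full policy parameters.

```python
import numpy as np

def one_plus_one_es(cost, x0, sigma=0.3, iters=200, seed=0):
    # Simple (1+1) evolution strategy with 1/5-style step-size adaptation;
    # a stand-in for CMA-ES, not the authors' optimizer.
    rng = np.random.default_rng(seed)
    x, fx = x0.copy(), cost(x0)
    for _ in range(iters):
        cand = x + sigma * rng.standard_normal(x.shape)
        fc = cost(cand)
        if fc < fx:          # accept improvements, adapt step size
            x, fx = cand, fc
            sigma *= 1.1
        else:
            sigma *= 0.98
    return x, fx

def hybrid_optimize(cost, skill_predict, tau0, iters_coarse=100, iters_fine=300):
    # Phase 1: coarse search over the low-dimensional manifold coordinate
    # (cheap, since each candidate is decoded by the parameterized skill).
    tau, _ = one_plus_one_es(lambda t: cost(skill_predict(t)),
                             np.atleast_1d(tau0), sigma=0.2, iters=iters_coarse)
    # Phase 2: fine-grained search in the full policy-parameter space,
    # warm-started from the best manifold solution.
    w0 = skill_predict(tau)
    return one_plus_one_es(cost, w0, sigma=0.05, iters=iters_fine, seed=1)
```

The point of the two phases is that phase 1 needs few rollouts because the manifold is low-dimensional, while phase 2 recovers the residual performance that the manifold cannot represent.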
“…These sub-policies are treated as latent variables in an expectation-maximization procedure, allowing the distribution of the update information between the sub-policies. In Queisser and Steil [148], an upper-level policy is used to interpolate between policy parameterizations for different task variations. This substantially speeds up learning when many variations of a task must be learned.…”
Section: Trajectory-Based Policies
confidence: 99%
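The upper-level-policy idea quoted above — reusing parameterizations learned for some task variations to initialize others — can be illustrated minimally. Linear interpolation between two tuned weight vectors is only the simplest conceivable stand-in for the learned upper-level policy; names and values here are illustrative.

```python
import numpy as np

def interpolate_policy(tau, tau_a, w_a, tau_b, w_b):
    # Propose policy parameters for task variation tau from two known
    # variations (tau_a, w_a) and (tau_b, w_b) instead of relearning.
    alpha = (tau - tau_a) / (tau_b - tau_a)
    return (1.0 - alpha) * w_a + alpha * w_b
```

Even this crude proposal typically lands closer to a good policy than a random initialization, which is why interpolating between parameterizations speeds up learning across many task variations.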
“…They tested their approach in simulation for 10-, 15-, 20-, and 25-DoF systems. In order to reduce the number of required rollouts for adaptation to new task conditions, Queißer and Steil (2018) used CMA-ES to optimize DMP parameters. In addition, they introduced a hybrid optimization method that combines a fast coarse optimization on a manifold of policy parameters with a fine-grained parameter search in the unrestricted space of actions.…”
Section: High DoF Robots
confidence: 99%