2012
DOI: 10.1007/978-3-642-30353-1_31

Curriculum Learning for Motor Skills

Abstract: Humans and animals acquire their wide repertoire of motor skills through an incremental learning process, during which progressively more complex skills are acquired and subsequently integrated with prior abilities. The order in which the skills are learned and the progressive manner in which they are developed play an important role in developing a final skill set. Inspired by this general idea, we develop an approach for learning motor skills based on a two-level curriculum. At the high level, the …

Cited by 47 publications (31 citation statements)
References 13 publications
“…In practice, it is helpful to regularly estimate the local gradient of the next landscape before each transition. Ideally, we would want to design a sequence of training wheels that funnel into each other, similar in concept to [16], [20]. In this work, when to switch between environments was chosen heuristically.…”
Section: Discussion
confidence: 99%
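The heuristic environment switching described in this excerpt can be pictured as a simple threshold rule: train on the current environment until the policy clears a performance bar, then advance. The sketch below is a minimal illustration under assumed names; `train_step`, `evaluate`, and the reward threshold are hypothetical stand-ins, not the cited work's implementation.

```python
# Minimal sketch of a heuristic curriculum switch between environments.
# The train_step/evaluate helpers and the reward threshold are
# illustrative assumptions, not from the cited paper.

def train_with_curriculum(envs, policy, train_step, evaluate,
                          reward_threshold=0.9, eval_every=1000):
    """Advance to the next environment once the policy clears a fixed
    average-reward threshold on the current one."""
    for env in envs:
        step = 0
        while True:
            train_step(policy, env)          # one policy update on this env
            step += 1
            if step % eval_every == 0:
                avg_reward = evaluate(policy, env)
                if avg_reward >= reward_threshold:
                    break                    # heuristic switch: move on
```

A gradient-aware variant, as the excerpt suggests, would additionally probe the next environment's objective before committing to the transition rather than relying on the threshold alone.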
“…Alternatively, PG methods can operate in continuous or discrete action spaces (DeepMind Technologies, 2014) and are becoming the preferred choice for reinforcement learning tasks (Karpathy, 2016). Karpathy suggested that PG methods are becoming favoured because they are end-to-end: there is an explicit policy and a principled approach that directly optimizes the expected reward (Karpathy, 2016). Instead of estimating the future reward for every state-action pair based upon the data points collected, we estimate the future reward of the policy based on the policy parameters.…”
Section: Methods
confidence: 99%
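The distinction drawn in this excerpt, estimating the return of the policy as a function of its parameters rather than learning a value for every state-action pair, can be made concrete with a minimal REINFORCE-style sketch. Everything here is an illustrative assumption: a linear softmax policy and an episode supplied as (state, action, return-to-go) tuples.

```python
import numpy as np

# Minimal REINFORCE-style sketch: ascend the expected return directly
# through the policy parameters theta, instead of estimating a value
# for every (state, action) pair. Linear softmax policy over discrete
# actions; all names are illustrative assumptions.

def softmax_policy(theta, state):
    logits = state @ theta                  # (d,) @ (d, n_actions) -> (n_actions,)
    z = np.exp(logits - logits.max())       # numerically stabilized softmax
    return z / z.sum()

def reinforce_update(theta, episode, lr=0.01):
    """episode: list of (state, action, return_to_go) tuples."""
    for state, action, G in episode:
        probs = softmax_policy(theta, state)
        grad_log = -np.outer(state, probs)  # d log pi(a|s) / d theta ...
        grad_log[:, action] += state        # ... for a linear softmax policy
        theta = theta + lr * G * grad_log   # gradient ascent on the return
    return theta
```

Note that the only learned object is `theta`; a value-based method would instead maintain a separate estimate Q(s, a) for each state-action pair it encounters.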
“…In the last couple of decades, the field of Deep Reinforcement Learning (DRL) has established itself; the curriculum design process can be either domain-expert specified [8] or automatic [9]. Domain-expert specified curricula rely heavily on human knowledge and thus lack new-knowledge discovery, scalability, and robustness to unseen scenarios, and may exhibit bias towards certain tasks.…”
Section: Reinforcement Learning
confidence: 99%
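The two curriculum styles contrasted in this excerpt can be sketched as two task-selection rules. The task names and the learning-progress heuristic below are illustrative assumptions, not taken from the cited works [8], [9].

```python
import random

# Two curriculum styles sketched as task-selection rules.
# Task names and the learning-progress heuristic are illustrative.

EXPERT_CURRICULUM = ["stand", "walk", "run", "jump"]  # hand-ordered by an expert

def expert_next_task(stage):
    """Domain-expert curriculum: follow the fixed, hand-specified order."""
    return EXPERT_CURRICULUM[min(stage, len(EXPERT_CURRICULUM) - 1)]

def automatic_next_task(progress):
    """Automatic curriculum: pick the task with the largest recent reward
    improvement ('learning progress'), breaking ties randomly."""
    best = max(progress.values())
    return random.choice([t for t, p in progress.items() if p == best])
```

The expert rule encodes human knowledge up front, which is exactly what the excerpt identifies as its weakness: it cannot discover orderings the designer did not anticipate, while the automatic rule adapts to whichever task is currently improving fastest.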