2020
DOI: 10.1109/tnnls.2019.2934906

Teacher–Student Curriculum Learning

Abstract: We propose Teacher-Student Curriculum Learning (TSCL), a framework for automatic curriculum learning, where the Student tries to learn a complex task and the Teacher automatically chooses subtasks from a given set for the Student to train on. We describe a family of Teacher algorithms that rely on the intuition that the Student should practice more those tasks on which it makes the fastest progress, i.e. where the slope of the learning curve is highest. In addition, the Teacher algorithms address the problem o…
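The intuition in the abstract (train more on the task whose learning curve is rising fastest) can be illustrated with a small sketch. This is not the paper's algorithm: the class `SlopeTeacher`, the smoothing factor `alpha`, the exploration rate `epsilon`, and the toy learning curves in `toy_student_score` are all assumptions made for illustration, and the sketch does not cover the part of the abstract that is cut off above.

```python
import random


class SlopeTeacher:
    """Hypothetical teacher that tracks a smoothed slope of each task's learning curve."""

    def __init__(self, num_tasks, alpha=0.3, epsilon=0.1, seed=0):
        self.alpha = alpha              # smoothing factor for the slope estimate (assumed)
        self.epsilon = epsilon          # probability of picking a random task (assumed)
        self.slopes = [0.0] * num_tasks
        self.last_score = [None] * num_tasks
        self.rng = random.Random(seed)

    def choose_task(self):
        # Visit every task once first so that each has a baseline score.
        for task, score in enumerate(self.last_score):
            if score is None:
                return task
        # Occasionally explore so that slope estimates do not go stale.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.slopes))
        # Otherwise pick the task with the highest estimated slope.
        return max(range(len(self.slopes)), key=lambda t: self.slopes[t])

    def update(self, task, score):
        previous = self.last_score[task]
        self.last_score[task] = score
        if previous is not None:
            change = score - previous   # one-step slope of the learning curve
            self.slopes[task] += self.alpha * (change - self.slopes[task])


def toy_student_score(task, steps):
    # Stand-in learning curves: task 1 improves faster than task 0, then saturates.
    rates = [0.01, 0.05]
    return min(1.0, rates[task] * steps)


teacher = SlopeTeacher(num_tasks=2)
steps = [0, 0]
for _ in range(50):
    task = teacher.choose_task()
    steps[task] += 1
    teacher.update(task, toy_student_score(task, steps[task]))
```

In a real setup the score passed to `update` would come from evaluating the Student after training on the chosen task; the toy curves only stand in for that signal.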

Citations: cited by 209 publications (182 citation statements)
References: 29 publications
“…The environment and framework provide layers of abstraction that facilitate tasks ranging from simple navigation to collaborative problem solving. Due to the nature of the simulation, several works have also investigated lifelong-learning, curriculum learning, and hierarchical planning using Minecraft as a platform (Tessler et al, 2017;Matiisen et al, 2017;Branavan et al, 2012;Oh et al, 2016).…”
Section: Games (mentioning)
confidence: 99%
“…In this paper, we propose an algorithm which gives rise to a curriculum in a direct manner. Our approach borrows from curriculum learning in multi-armed bandit setting (Graves et al, 2017;Matiisen et al, 2017), where learning is typically done by measuring the change in a performance criterion of a given agent (i.e. a loss function, score or gradient norm can be used) that appears to affect the form of the optimal policy.…”
Section: Related Work (mentioning)
confidence: 99%
“…In order to provide meaningful feedback for learning efficient mixtures of discriminators, we consider different reward functions to generate R_i(t). We argue that progress (i.e., the learning slope (Graves et al, 2017;Matiisen et al, 2017)) of the generator is a more sensible way to evaluate our policy. Let θ(t) be the generator parameters at episode t. We define the two following quantities for measuring generator progress:…”
Section: Reward Shaping (mentioning)
confidence: 99%
“…This can be done by controlling the order in which examples are introduced [3]. Other approaches use a teacher network to enhance the learning of a student network [12,40]. NETTAILOR uses a replica of the source network, fine-tuned on the target task, as a teacher for the learning of the simplified network.…”
Section: Related Work (mentioning)
confidence: 99%