Teacher–Student Curriculum Learning

Matiisen, Tambet; Oliver, Avital; Cohen, Taco; Schulman, John

doi:10.1109/tnnls.2019.2934906

Cited by 209 publications

(182 citation statements)

References 29 publications

Citation statement types: Supporting, Mentioning (181), Contrasting, Unclassified


“…The environment and framework provide layers of abstraction that facilitate tasks ranging from simple navigation to collaborative problem solving. Due to the nature of the simulation, several works have also investigated lifelong-learning, curriculum learning, and hierarchical planning using Minecraft as a platform (Tessler et al, 2017; Matiisen et al, 2017; Branavan et al, 2012; Oh et al, 2016).…”

Section: Games (mentioning)

confidence: 99%

An Introduction to Deep Reinforcement Learning

François-Lavet, Henderson, Islam, et al. 2018

FNT in Machine Learning


Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Thus, deep RL opens up many new applications in domains such as healthcare, robotics, smart grids, finance, and many more. This manuscript provides an introduction to deep reinforcement learning models, algorithms and techniques. Particular focus is on the aspects related to generalization and how deep RL can be used for practical applications. We assume the reader is familiar with basic machine learning concepts.

[…] may be constrained (e.g., no access to an accurate simulator or limited data).

Over the past few years, RL has become increasingly popular due to its success in addressing challenging sequential decision-making problems. Several of these achievements are due to the combination of RL with deep learning techniques (LeCun et al., 2015; Schmidhuber, 2015; Goodfellow et al., 2016). This combination, called deep RL, is most useful in problems with a high-dimensional state space. Previous RL approaches had a difficult design issue in the choice of features (Munos and Moore, 2002; Bellemare et al., 2013). However, deep RL has been successful in complicated tasks with less prior knowledge thanks to its ability to learn different levels of abstraction from data. For instance, a deep RL agent can successfully learn from visual perceptual inputs made up of thousands of pixels (Mnih et al., 2015). This opens up the possibility of mimicking some human problem-solving capabilities, even in high-dimensional spaces, which was difficult to conceive only a few years ago.

Several notable works using deep RL in games have stood out for attaining super-human level in playing Atari games from the pixels (…


“…In this paper, we propose an algorithm which gives rise to a curriculum in a direct manner. Our approach borrows from curriculum learning in multi-armed bandit setting (Graves et al, 2017; Matiisen et al, 2017), where learning is typically done by measuring the change in a performance criterion of a given agent (i.e. a loss function, score or gradient norm can be used) that appears to affect the form of the optimal policy.…”

Section: Related Work (mentioning)

confidence: 99%

“…In order to provide meaningful feedback for learning efficient mixtures of discriminators, we consider different reward functions to generate R_i(t). We argue that progress (i.e., the learning slope (Graves et al, 2017; Matiisen et al, 2017)) of the generator is a more sensible way to evaluate our policy. Let θ(t) be the generator parameters at episode t. We define the two following quantities for measuring generator progress:…”

Section: Reward Shaping (mentioning)

confidence: 99%

On-Line Adaptative Curriculum Learning for GANs

Doan, Monteiro, Albuquerque, et al. 2019

AAAI


Generative Adversarial Networks (GANs) can successfully approximate a probability distribution and produce realistic samples. However, open questions such as sufficient convergence conditions and mode collapse still persist. In this paper, we build on existing work in the area by proposing a novel framework for training the generator against an ensemble of discriminator networks, which can be seen as a one-student/multiple-teachers setting. We formalize this problem within the full-information adversarial bandit framework, where we evaluate the capability of an algorithm to select mixtures of discriminators for providing the generator with feedback during learning. To this end, we propose a reward function which reflects the progress made by the generator and dynamically update the mixture weights allocated to each discriminator. We also draw connections between our algorithm and stochastic optimization methods and then show that existing approaches using multiple discriminators in literature can be recovered from our framework. We argue that less expressive discriminators are smoother and have a general coarse grained view of the modes map, which enforces the generator to cover a wide portion of the data distribution support. On the other hand, highly expressive discriminators ensure samples quality. Finally, experimental results show that our approach improves samples quality and diversity over existing baselines by effectively learning a curriculum. These results also support the claim that weaker discriminators have higher entropy improving modes coverage.
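The excerpts above describe rewarding a bandit with learning progress, meaning the change in a performance score, rather than with the score itself. The following is a minimal Python sketch of that idea, not the implementation from any of the cited papers. The class name `ProgressBandit`, the parameters `alpha` and `epsilon`, and the epsilon-greedy selection rule are all assumptions made for illustration.

```python
import random


class ProgressBandit:
    """Hypothetical teacher that favours the arm whose score changes fastest."""

    def __init__(self, n_arms, alpha=0.1, epsilon=0.1, seed=0):
        self.q = [0.0] * n_arms            # smoothed learning progress per arm
        self.last_score = [None] * n_arms  # most recent score seen per arm
        self.alpha = alpha                 # smoothing rate (assumed value)
        self.epsilon = epsilon             # exploration rate (assumed value)
        self.rng = random.Random(seed)

    def choose(self):
        # Explore uniformly with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        # Otherwise pick the arm with the largest absolute progress estimate.
        return max(range(len(self.q)), key=lambda i: abs(self.q[i]))

    def update(self, arm, score):
        # Reward is the change in score since the arm was last played;
        # the first observation of an arm yields zero reward.
        prev = self.last_score[arm]
        reward = 0.0 if prev is None else score - prev
        # Exponential moving average of the reward.
        self.q[arm] += self.alpha * (reward - self.q[arm])
        self.last_score[arm] = score
        return reward
```

In a training loop, `choose()` would select the next task (or discriminator), the learner would train on it, and `update()` would receive the resulting score. Taking the absolute value of the progress estimate means an arm whose score is dropping is also revisited, which is one possible design choice among several.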


“…This can be done by controlling the order in which examples are introduced [3]. Other approaches use a teacher network to enhance the learning of a student network [12,40]. NETTAILOR uses a replica of the source network, fine-tuned on the target task, as a teacher for the learning of the simplified network.…”

Section: Related Work (mentioning)

confidence: 99%

NetTailor: Tuning the Architecture, Not Just the Weights

Morgado, Vasconcelos 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)


Real-world applications of object recognition often require the solution of multiple tasks in a single platform. Under the standard paradigm of network fine-tuning, an entirely new CNN is learned per task, and the final network size is independent of task complexity. This is wasteful, since simple tasks require smaller networks than more complex tasks, and limits the number of tasks that can be solved simultaneously. To address these problems, we propose a transfer learning procedure, denoted NETTAILOR, in which layers of a pre-trained CNN are used as universal blocks that can be combined with small task-specific layers to generate new networks. Besides minimizing classification error, the new network is trained to mimic the internal activations of a strong unconstrained CNN, and minimize its complexity by the combination of 1) a soft-attention mechanism over blocks and 2) complexity regularization constraints. In this way, NETTAILOR can adapt the network architecture, not just its weights, to the target task. Experiments show that networks adapted to simple tasks, such as character or traffic sign recognition, become significantly smaller than those adapted to hard tasks, such as fine-grained recognition. More importantly, due to the modular nature of the procedure, this reduction in network complexity is achieved without compromise of either parameter sharing across tasks, or classification accuracy.

