Deep reinforcement learning (deep RL) has achieved superior performance in complex sequential tasks by using deep neural networks as function approximators and learning directly from raw input images. However, learning directly from raw images is data inefficient: the agent must learn a feature representation of complex states in addition to learning a policy. As a result, deep RL typically suffers from slow learning and often requires a prohibitively large amount of training time and data to reach reasonable performance, making it inapplicable to real-world settings where data is expensive. In this work, we improve data efficiency in deep RL by addressing one of these two learning goals, feature learning. We leverage supervised learning to pre-train on a small set of non-expert human demonstrations, and our results show significant improvements in learning speed even when the provided demonstrations are noisy and of low quality.

We empirically evaluate our approach using the Asynchronous Advantage Actor-Critic (A3C) algorithm in six Atari games (Bellemare et al. 2013). Unlike previous work, where a large amount of expert human data is required to achieve a good initial performance boost, our approach shows significant learning-speed improvements in all experiments with only a relatively small amount of noisy, non-expert demonstration data. The simplicity of our approach makes it readily adaptable to other deep RL algorithms, and potentially to other domains, since demonstration data is easy to collect. In addition, we apply Gradient-weighted Class Activation Mapping (Grad-CAM) (Selvaraju et al. 2017) to the learned feature maps for both the human data and the agent data, providing a detailed analysis of why pre-training helps to speed up learning.

Our work makes the following contributions:

1. We show that pre-training on a small amount of non-expert human demonstration data is sufficient to achieve significant performance improvements.
2. We are the first to apply the transformed Bellman (TB) operator (Pohlen et al. 2018) to the A3C algorithm, further improving A3C's performance for both the baseline and the pre-training methods.
3. We propose a modified version of the Grad-CAM method (Selvaraju et al. 2017) and are the first to provide an empirical analysis of which features are learned from pre-training, indicating why pre-training on human demonstration data helps.
4. We release our code and all collected human demonstration data at https://github.com/gabrieledcjr/DeepRL.

This article is organized as follows. In the next section, we review related work on using pre-training to improve data efficiency. Section 3 provides background on deep RL algorithms and the transformed Bellman operator. In Section 4, we propose our pre-training methods for deep RL. Section 5 describes the experimental design. Results and analysis are presented in Section 6. We conclude the article in Section 7 with a discussion.
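To make the pre-training step concrete before its full description in Section 4, the following is a minimal sketch of supervised pre-training on human demonstration data, written here in PyTorch with a standard small Atari encoder; the network sizes, loss, and training loop are illustrative assumptions rather than the exact configuration used in our experiments.

    # Minimal sketch (PyTorch; hypothetical names, not the paper's exact setup):
    # supervised pre-training of the convolutional encoder and the policy head on
    # human (state, action) pairs, treated as action classification before any RL updates.
    import torch
    import torch.nn as nn

    class ActorCriticNet(nn.Module):
        def __init__(self, num_actions):
            super().__init__()
            # Small Atari encoder; layer sizes are illustrative.
            self.encoder = nn.Sequential(
                nn.Conv2d(4, 16, 8, stride=4), nn.ReLU(),
                nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
                nn.Flatten(),
                nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
            )
            self.policy = nn.Linear(256, num_actions)  # actor head
            self.value = nn.Linear(256, 1)             # critic head

        def forward(self, frames):
            features = self.encoder(frames)
            return self.policy(features), self.value(features)

    def pretrain_on_demonstrations(net, demo_loader, epochs=5, lr=1e-4):
        """Behavioral-cloning-style pre-training: cross-entropy between the
        policy logits and the human action labels; RL training then starts
        from the resulting weights."""
        optimizer = torch.optim.Adam(net.parameters(), lr=lr)
        criterion = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for frames, human_actions in demo_loader:  # (B, 4, 84, 84), (B,)
                logits, _ = net(frames)
                loss = criterion(logits, human_actions)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        return net

After pre-training, the resulting weights serve as the initialization for A3C training; which heads are pre-trained and how the demonstration data is batched are design choices covered in Section 4.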
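For reference, the transformed Bellman (TB) operator of Pohlen et al. (2018), which contribution 2 brings into A3C, avoids reward clipping by squashing value targets with an invertible function h. In the form given by Pohlen et al.,

    h(z) = \operatorname{sign}(z)\left(\sqrt{|z| + 1} - 1\right) + \varepsilon z,
    (\mathcal{T}_h Q)(x, a) = \mathbb{E}\left[ h\left( R(x, a) + \gamma \max_{a'} h^{-1}\big(Q(x', a')\big) \right) \right],

where \varepsilon is a small constant (10^{-2} in Pohlen et al. 2018); Section 3 provides the full background and how we adapt it to A3C.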