Deep auto-encoder neural networks in reinforcement learning

Lange, Sascha; Riedmiller, Martin

doi:10.1109/ijcnn.2010.5596468

Cited by 281 publications

(150 citation statements)

References 18 publications

Citation types: Supporting; Mentioning (149); Contrasting; Unclassified

“…In reinforcement learning, for example, using adaptive unsupervised preprocessors has become increasingly popular (e.g., [22]). However, features generated by algorithms that forget what they learned before may no longer be valid for parts of the environment the agent has not visited recently.…”

Section: Discussion (mentioning)

confidence: 99%

Modular deep belief networks that do not forget

Pape, Gomez, Ring, et al. 2011

The 2011 International Joint Conference on Neural Networks

Abstract: Deep belief networks (DBNs) are popular for learning compact representations of high-dimensional data. However, most approaches so far rely on having a single, complete training set. If the distribution of relevant features changes during subsequent training stages, the features learned in earlier stages are gradually forgotten. Often it is desirable for learning algorithms to retain what they have previously learned, even if the input distribution temporarily changes. This paper introduces the M-DBN, an unsupervised modular DBN that addresses the forgetting problem. M-DBNs are composed of a number of modules that are trained only on samples they best reconstruct. While modularization by itself does not prevent forgetting, the M-DBN additionally uses a learning method that adjusts each module's learning rate proportionally to the fraction of best reconstructed samples. On the MNIST handwritten digit dataset, module specialization largely corresponds to the digits discerned by humans. Furthermore, in several learning tasks with changing MNIST digits, M-DBNs retain learned features even after those features are removed from the training data, while monolithic DBNs of comparable size forget feature mappings learned before.
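The learning-rate rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `errors` matrix layout, and the `base_lr` parameter are assumptions made for illustration, and the actual training of the DBN modules is omitted.

```python
def assign_to_best_module(errors):
    """Return, for each sample, the index of the module that reconstructs it best.

    errors[i][m] is the (hypothetical) reconstruction error of module m on sample i.
    """
    return [min(range(len(row)), key=lambda m: row[m]) for row in errors]


def module_learning_rates(assignments, n_modules, base_lr):
    """Scale each module's learning rate by its share of best-reconstructed samples.

    A module that wins no samples in the current batch gets a rate of 0.0,
    so its weights stay untouched while the input distribution has shifted.
    """
    counts = [0] * n_modules
    for m in assignments:
        counts[m] += 1
    total = len(assignments)
    return [base_lr * (c / total) for c in counts]


# Usage: four samples, three modules (error values are made up).
errors = [
    [0.1, 0.5, 0.9],
    [0.2, 0.9, 0.8],
    [0.7, 0.3, 0.9],
    [0.4, 0.6, 0.5],
]
assignments = assign_to_best_module(errors)
rates = module_learning_rates(assignments, n_modules=3, base_lr=0.1)
```

In this toy batch the third module wins no samples, so its rate is zero and it would keep whatever features it had learned earlier.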

“…In the literature, this growing batch approach can be found in several different guises; the number of alternations between episodes of exploration and episodes of learning can be in the whole range of being as close to the pure batch approach as using only two iterations to recalculating the policy after every few interactions, e.g. after finishing one episode in a shortest-path problem (Kalyanakrishnan and Stone, 2007; Lange and Riedmiller, 2010a). In practice, the growing batch approach is the modeling of choice when applying batch reinforcement learning algorithms to real systems.…”

Section: The Growing Batch Learning Problem (mentioning)

confidence: 99%
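The alternation between exploration and batch learning described in the statement above can be sketched as follows. This is a toy illustration under stated assumptions, not the cited authors' method: the six-state chain, the reward scheme, and all names (`step`, `batch_q_iteration`, `growing_batch`) are hypothetical, and a tabular Q-iteration over stored transitions stands in for a fitted Q iteration with a function approximator.

```python
import random

N_STATES = 6        # states 0..5; state 5 is the terminal goal
ACTIONS = (-1, 1)   # move left / move right
GAMMA = 0.9


def step(state, action):
    """Deterministic chain: reward -1 per step, 0 on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (0.0 if done else -1.0), done


def batch_q_iteration(transitions, sweeps=50):
    """Recompute Q from scratch using only the stored transitions."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        new_q = dict(q)
        for s, a, r, s2, done in transitions:
            best_next = max(q[(s2, b)] for b in ACTIONS)
            new_q[(s, a)] = r if done else r + GAMMA * best_next
        q = new_q
    return q


def greedy(q, s):
    return max(ACTIONS, key=lambda a: q[(s, a)])


def growing_batch(episodes=30, epsilon=0.3, seed=0):
    """Alternate one episode of exploration with one batch update."""
    rng = random.Random(seed)
    transitions = []
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        for _ in range(50):  # cap on episode length
            a = rng.choice(ACTIONS) if rng.random() < epsilon else greedy(q, s)
            s2, r, done = step(s, a)
            transitions.append((s, a, r, s2, done))  # the batch only grows
            s = s2
            if done:
                break
        q = batch_q_iteration(transitions)  # relearn on all data so far
    return q, transitions


q, transitions = growing_batch()
```

Updating after every episode sits at the frequent-update end of the range the statement describes; raising the number of episodes collected between calls to `batch_q_iteration` moves the sketch toward the pure batch setting.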

“…The DFQ algorithm has been successfully applied to learning visual control policies in a grid-world benchmark problem, using synthesized (Lange and Riedmiller, 2010b) as well as screen-captured images (Lange and Riedmiller, 2010a), and to controlling a slot-car racer only on the basis of the raw image data captured by a top-mounted camera (Lange, 2010).…”

Section: Deep Fitted Q Iteration (mentioning)

confidence: 99%

Batch Reinforcement Learning

Lange, Gabel, Riedmiller 2012

Adaptation, Learning, and Optimization

Batch reinforcement learning is a subfield of dynamic programming-based reinforcement learning. Originally defined as the task of learning the best possible policy from a fixed set of a priori-known transition samples, the (batch) algorithms developed in this field can be easily adapted to the classical online case, where the agent interacts with the environment while learning. Due to the efficient use of collected data and the stability of the learning process, this research area has attracted a lot of attention recently. In this chapter, we introduce the basic principles and the theory behind batch reinforcement learning, describe the most important algorithms, exemplarily discuss ongoing research within this field, and briefly survey real-world applications of batch reinforcement learning.

“…Unsupervised learning of deep auto-encoder networks was integrated into batch reinforcement learning in [2,3]. Near-optimal policies were demonstrated on automatically learned feature spaces in a grid-world-like task.…”

Section: Introduction (mentioning)

confidence: 99%

Hierarchical extreme learning machine based reinforcement learning for goal localization

AlDahoul, Htike, Akmeliawati 2017

IOP Conf. Ser.: Mater. Sci. Eng.

Abstract: The objective of goal localization is to find the location of goals in noisy environments. Simple actions are performed to move the agent towards the goal. The goal detector should be capable of minimizing the error between the predicted locations and the true ones. Only a few regions should need to be processed by the agent, to reduce the computational effort and increase the speed of convergence. In this paper, a reinforcement learning (RL) method was used to find an optimal series of actions to localize the goal region. The visual data, a set of images, is high-dimensional unstructured data and needs to be represented efficiently to obtain a robust detector. Various deep reinforcement learning models have already been used to localize a goal, but most of them take a long time to learn the model. This long learning time results from the weight fine-tuning stage, which is applied iteratively to find an accurate model. The Hierarchical Extreme Learning Machine (H-ELM) was used as a fast deep model that does not fine-tune the weights. In other words, hidden weights are generated randomly and output weights are calculated analytically. The H-ELM algorithm was used in this work to find good features for effective representation. This paper proposes a combination of the Hierarchical Extreme Learning Machine and reinforcement learning to find an optimal policy directly from visual input. This combination outperforms other methods in terms of accuracy and learning speed. The simulations and results were analysed using MATLAB.
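The phrase "hidden weights are generated randomly and output weights are calculated analytically" can be made concrete with a small NumPy sketch. It shows a single-layer extreme learning machine only, not the hierarchical H-ELM of the paper and not its MATLAB code; the function names, the tanh activation, the hidden-layer size, and the toy regression target are all assumptions chosen for illustration.

```python
import numpy as np


def elm_fit(X, y, n_hidden=50, seed=0):
    """Fit a single-hidden-layer ELM: random hidden layer, least-squares output layer."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random hidden weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                 # output weights via the pseudo-inverse
    return W, b, beta


def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta


# Usage: fit a toy 1-D regression target (hypothetical data).
X = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = np.sin(3.0 * X[:, 0])
W, b, beta = elm_fit(X, y)
mse = float(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
```

Because the only fitted parameters come from one least-squares solve, there is no iterative fine-tuning loop, which is the source of the speed advantage the abstract claims.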
