2016
DOI: 10.1007/978-3-319-46687-3_2
Deep Q-Learning with Prioritized Sampling

Cited by 23 publications (15 citation statements)
References 3 publications
“…Using only uniform sampling to draw experiences from the replay memory has proved to have limitations, such as that some valuable experiences might never be replayed [5]. Attention-based replay memory keeps the uniform sampling and extends it by additionally sampling the experiences that emerged from a specific type of interaction.…”
Section: Model Architecture and Learning Algorithm
Citation type: mentioning
confidence: 99%
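The sampling scheme described in this statement can be sketched in code. The snippet below is a hypothetical illustration, not the cited papers' implementation: most of each minibatch is drawn uniformly from the replay memory and topped up with transitions tagged as coming from a specific type of interaction; the `flagged` marker and the 25% split are assumptions made purely for illustration.

```python
import random

class MixedReplayBuffer:
    """Uniform replay extended with extra draws from 'flagged' transitions.

    Hypothetical sketch only: transitions stored with flagged=True stand in
    for experiences that emerged from a specific type of interaction.
    """

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self.buffer = []

    def store(self, transition, flagged=False):
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)  # discard the oldest transition
        self.buffer.append((transition, flagged))

    def sample(self, batch_size=32, flagged_fraction=0.25):
        # Draw the bulk of the minibatch uniformly, as in plain replay memory.
        n_flagged = int(batch_size * flagged_fraction)
        uniform_pool = [t for t, _ in self.buffer]
        batch = random.sample(uniform_pool,
                              min(batch_size - n_flagged, len(uniform_pool)))
        # Top it up with transitions from the flagged subset, if any exist.
        flagged_pool = [t for t, f in self.buffer if f]
        if flagged_pool:
            batch += random.choices(flagged_pool, k=n_flagged)
        return batch
```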
“…Previous approaches have dealt with the dynamics of the replay memory mechanism in order to improve the speed of learning by focusing on the transitions with a larger TD error, in both experience sampling [5] and experience replay [4], but none was concerned with modifying the characteristics of the learning process itself.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
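For context, TD-error-based prioritization of the kind referenced here is commonly sketched as sampling transitions with probability proportional to their absolute TD error. The snippet below is a minimal illustration under that convention; the exponent `alpha` and the constant `eps` are assumed defaults, not values taken from [4] or [5].

```python
import numpy as np

def prioritized_indices(td_errors, batch_size=32, alpha=0.6, eps=1e-6):
    """Sample transition indices with probability proportional to
    (|TD error| + eps) ** alpha; alpha = 0 recovers uniform sampling."""
    priorities = (np.abs(np.asarray(td_errors, dtype=np.float64)) + eps) ** alpha
    probs = priorities / priorities.sum()
    return np.random.choice(len(probs), size=batch_size, p=probs)

# Transitions with a larger TD error are replayed more often.
td_errors = [0.01, 0.02, 2.5, 0.05, 1.2]
print(prioritized_indices(td_errors, batch_size=8))
```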
“…Recent implementations, such as [9], [10], include a memory buffer called replay memory that is functionally similar to human working memory: it selectively stores the experiences, or transitions, in order to replay and re-learn from them off-line, thereby reducing the amount of data that must be acquired through expensive processes and, at the same time, ensuring more stable training of the approximator needed to manage continuous variables. Later approaches that dealt with the replay memory mechanism aimed to improve the speed of learning by focusing attention on specific transitions that are more valuable to the learning process, using criteria such as temporal difference error [11], received reinforcement [12], and information potential of the state [13]. Usually, in machine learning, the agent prefers unexpected experiences, as they are more likely to "surprise" the predictor and feed the learning process, further reducing the uncertainty about the environment [14], [15].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
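The store-then-replay mechanism described in this statement can be summarized in a short sketch. The `env` and `agent` objects and their `reset`/`step`/`act`/`update` methods are hypothetical placeholders; the point is only that transitions are stored once and then re-learned from off-line in random minibatches.

```python
import random
from collections import deque

def train_with_replay(env, agent, episodes=100, batch_size=32, capacity=50_000):
    """Generic experience-replay loop: interact, store transitions, then
    re-learn from random minibatches off-line.  `env` and `agent` are
    hypothetical objects exposing reset/step and act/update methods."""
    memory = deque(maxlen=capacity)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            action = agent.act(state)
            next_state, reward, done = env.step(action)
            memory.append((state, action, reward, next_state, done))
            state = next_state
            if len(memory) >= batch_size:
                minibatch = random.sample(list(memory), batch_size)
                agent.update(minibatch)  # off-line update of the approximator
```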
“…Deep learning is capable of capturing high-level features from basic signals. Recently, Zhai et al. () combined RL with deep learning to obtain the advantages of both, calling the result deep RL. Deep Q-Learning is one of the deep RL methods; it combines Q-Learning in RL with a deep neural network.…”
Section: Introduction
Citation type: mentioning
confidence: 99%