2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
DOI: 10.1109/smc.2017.8122622
A novel DDPG method with prioritized experience replay

Cited by 185 publications (81 citation statements) · References 11 publications
“…Traditional RL samples experiences from the replay buffer with equal importance, which ignores the difference in value between experiences. Therefore, to improve learning efficiency and avoid local optima, we adopt the prioritized experience replay technique of [30]. In prioritized experience replay, the DRL agent draws experiences from the replay buffer with weights proportional to their TD-error.…”
Section: Prioritized Experience Replay
confidence: 99%
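As a sketch of the sampling rule the quote describes, a minimal proportional-priority replay buffer could look like the following. The class name and the hyperparameters alpha and eps are illustrative assumptions, not taken from the cited papers; practical implementations additionally use a sum-tree for O(log N) sampling and importance-sampling weights to correct the bias the non-uniform sampling introduces.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Minimal sketch: sample transitions with probability
    proportional to |TD-error|^alpha (plus eps so no transition
    ever has zero probability)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha
        self.eps = eps
        self.data = []        # stored transitions
        self.priorities = []  # |TD-error| per transition

    def add(self, transition, td_error):
        if len(self.data) >= self.capacity:  # drop oldest when full
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, batch_size):
        p = np.asarray(self.priorities) ** self.alpha
        p /= p.sum()  # normalize to a probability distribution
        idx = np.random.choice(len(self.data), batch_size, p=p)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, td_errors):
        # refresh priorities after the learning step recomputes TD-errors
        for i, e in zip(idx, td_errors):
            self.priorities[i] = abs(e) + self.eps
```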
“…TD-error is an effective priority metric, and it is easy to obtain in DQNs. PER is not limited to value-based algorithms; researchers have also applied it to policy-based algorithms, such as DDPG + PER [12]. The priority metric is the same as in DQN, because DDPG likewise optimizes a TD loss.…”
Section: Priority Experience Replay
confidence: 99%
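To illustrate how the same priority metric carries over to DDPG, the sketch below computes the critic's absolute TD-error, which would then be stored as the transition's priority (e.g., via update_priorities above). It uses PyTorch with toy stand-in networks; all names and dimensions here are our assumptions, not the cited implementation.

```python
import torch
import torch.nn as nn

state_dim, action_dim, gamma = 4, 2, 0.99

# Toy stand-ins; real DDPG networks would be larger.
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
critic_target = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
actor_target = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim), nn.Tanh())

def td_error(s, a, r, s_next, done):
    """|TD-error| of a DDPG critic, usable directly as a PER priority."""
    with torch.no_grad():
        a_next = actor_target(s_next)                        # deterministic target policy
        q_next = critic_target(torch.cat([s_next, a_next], dim=-1))
        y = r + gamma * (1.0 - done) * q_next                # bootstrapped target
    q = critic(torch.cat([s, a], dim=-1))
    return (y - q).abs().detach()                            # priority = |y - Q(s, a)|

# Example batch of transitions:
s = torch.randn(32, state_dim); a = torch.randn(32, action_dim)
r = torch.randn(32, 1); s2 = torch.randn(32, state_dim); d = torch.zeros(32, 1)
priorities = td_error(s, a, r, s2, d)
```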
“…In this work, we use an improved DQN named Double DQN (DDQN) [22] and DDPG [12]. In DDQN, the target value y is defined as shown in Equation 3, which addresses the over-optimistic value estimation problem.…”
Section: Preliminary
confidence: 99%
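The quoted passage refers to an Equation 3 that is not reproduced in this excerpt. For context, the standard Double DQN target (van Hasselt et al., 2016), which the description matches, is:

```latex
y = r + \gamma \, Q\big(s', \arg\max_{a'} Q(s', a'; \theta);\ \theta^{-}\big)
```

Here θ are the online network parameters and θ⁻ the periodically updated target network parameters; selecting the greedy action with the online network but evaluating it with the target network decouples selection from evaluation, which mitigates the overestimation bias of vanilla DQN.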