2022
DOI: 10.48550/arxiv.2205.03819
Preprint

Simultaneous Double Q-learning with Conservative Advantage Learning for Actor-Critic Methods

Abstract: Actor-critic Reinforcement Learning (RL) algorithms have achieved impressive performance in continuous control tasks. However, they still suffer from two nontrivial obstacles: low sample efficiency and overestimation bias. To address these, we propose Simultaneous Double Q-learning with Conservative Advantage Learning (SDQ-CAL). SDQ-CAL boosts Double Q-learning for off-policy actor-critic RL through a modification of the Bellman optimality operator with Advantage Learning. Specifically, SDQ-CAL improves s…

Cited by 0 publications
References 9 publications (16 reference statements)