The inverted pendulum is a classical control problem: starting from a random position, the pendulum must swing up and balance in the upright position. The problem has been solved by methods based on deep reinforcement learning (DRL), such as Deep Deterministic Policy Gradient (DDPG). However, DDPG has drawbacks. Its deterministic policy is not conducive to action exploration, and the policy can only be accurate if the Q value is estimated reasonably accurately. Early in training, however, the Q-value estimate carries a substantial error, so the parameters learned at that stage tend to deviate. This paper therefore combines the AdaBound optimizer with the DDPG algorithm to propose an optimization method for the inverted pendulum problem, and compares its performance with that of four published baselines. The experimental results show that, on the inverted pendulum problem, the proposed method outperforms these four baselines to a certain extent.
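To make the AdaBound idea concrete: the optimizer clips Adam's per-parameter step size between dynamic bounds that tighten toward a fixed SGD-like rate as training progresses, which damps the erratic early steps caused by inaccurate Q-value estimates. The following is only a minimal NumPy sketch of that update rule on a toy quadratic, not the paper's implementation; the hyperparameter names (`lr`, `final_lr`, `gamma`) follow common AdaBound conventions and are assumptions here.

```python
import numpy as np

def adabound_step(param, grad, state, lr=0.05, final_lr=0.1,
                  betas=(0.9, 0.999), gamma=1e-3, eps=1e-8):
    """One AdaBound update: an Adam step whose per-parameter step size
    is clipped into bounds that converge toward the rate final_lr."""
    b1, b2 = betas
    state["t"] += 1
    t = state["t"]
    # Adam's first and second moment estimates
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    bias1 = 1 - b1 ** t
    bias2 = 1 - b2 ** t
    # per-parameter Adam step size (bias-corrected)
    step_size = lr * np.sqrt(bias2) / bias1 / (np.sqrt(state["v"]) + eps)
    # dynamic bounds that shrink toward final_lr as t grows
    lower = final_lr * (1 - 1 / (gamma * t + 1))
    upper = final_lr * (1 + 1 / (gamma * t))
    step_size = np.clip(step_size, lower, upper)
    return param - step_size * state["m"]

# toy usage: minimize f(x) = x**2, whose gradient is 2x
x = np.array([5.0])
state = {"t": 0, "m": np.zeros_like(x), "v": np.zeros_like(x)}
for _ in range(500):
    x = adabound_step(x, 2 * x, state)
```

In a DDPG setting, the same rule would replace the plain Adam update applied to the actor and critic parameters, with `grad` coming from backpropagation through the respective loss.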