2020
DOI: 10.1109/access.2020.3020590

An Improved DDPG and Its Application Based on the Double-Layer BP Neural Network

Abstract: This paper focuses on three problems in applying the traditional Deep Deterministic Policy Gradient (DDPG) algorithm: insufficient agent exploration, unsatisfactory neural-network performance, and large fluctuations in the agent's output. Addressing the agent's exploration strategy, the network training algorithm, and the overall algorithm implementation, an improved DDPG method based on a double-layer BP neural network is proposed. This method introduces a fuzzy algorithm and the BFGS algorithm based on the Armijo-Goldste…
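The abstract names a BFGS method built on the Armijo-Goldstein criterion. As a minimal, generic sketch (not the paper's implementation; function and parameter names are illustrative), a backtracking line search that enforces the Armijo sufficient-decrease condition looks like:

```python
import numpy as np

def armijo_backtracking(f, grad_f, x, d, alpha0=1.0, c=1e-4, rho=0.5, max_iter=50):
    """Shrink the step alpha until the Armijo (sufficient-decrease)
    condition f(x + alpha*d) <= f(x) + c*alpha*(grad . d) holds."""
    fx = f(x)
    slope = grad_f(x) @ d          # directional derivative along d
    alpha = alpha0
    for _ in range(max_iter):
        if f(x + alpha * d) <= fx + c * alpha * slope:
            return alpha
        alpha *= rho
    return alpha

# Usage: minimize f(x) = x^2 along the steepest-descent direction.
f = lambda x: float(x @ x)
g = lambda x: 2 * x
x = np.array([1.0])
step = armijo_backtracking(f, g, x, d=-g(x))
```

In a BFGS loop, the search direction `d` would come from the inverse-Hessian approximation rather than the raw negative gradient.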

Cited by 28 publications (14 citation statements)
References 19 publications
“…Due to the salt cavern exploitation foundation and previous experience of underground gas storage in Jintan, salt caverns are selected to be the UHS formation in the proposed IHES. To simulate the storage process and injection/withdrawal cycles of UHS, the hydrodynamic model of UHS can be denoted by (31).…”
Section: Case Description
confidence: 99%
“…Deep deterministic policy gradient (DDPG) is a kind of machine learning algorithm for optimisation problems with model‐free structures [30]. Based on the system response data, the controlled systems can achieve adaptive updating, and control strategies can be adaptively optimised [31]. DDPG has been successfully applied to achieve adaptively optimal control in various fields, including Internet‐of‐Things networks [32], energy harvesting [33], and hybrid electric bus [34].…”
Section: Introduction
confidence: 99%
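The passage above describes DDPG as a model-free algorithm that adaptively updates from system response data. One concrete mechanism behind that stable adaptive updating is the soft (Polyak) update of the target networks. A minimal NumPy sketch, with illustrative parameter names not taken from the cited papers:

```python
import numpy as np

def soft_update(target, source, tau=0.005):
    """DDPG-style Polyak averaging: theta' <- tau*theta + (1 - tau)*theta'."""
    return {k: tau * source[k] + (1.0 - tau) * target[k] for k in target}

# Toy actor weights and a target copy that slowly tracks them.
actor = {"w": np.array([1.0, 2.0])}
target_actor = {"w": np.array([0.0, 0.0])}
target_actor = soft_update(target_actor, actor, tau=0.1)
# With tau=0.1, the target weights move 10% of the way toward the online weights.
```

The small `tau` keeps the target networks changing slowly, which damps the output fluctuation problem the abstract mentions.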
“…Although DQN has had a strong influence in great dimensional problems, the search space may remain low-key. In this case the Deep Deterministic Policy Gradients (DDPG) [13] off-policy algorithm may be used.…”
Section: Markovian Decision-making Process Modeling
confidence: 99%
“…Mnih et al [7] proposed the concept of two-layer BP neural network and hence improved the DDPG algorithm. The search efficiency of BP network was improved by using Armijo-Goldstein-based criterion and BFGS method [8]. Nikishin et al [9] reduced the influence of noise on the gradient by averaging methods under the premise of random weights.…”
Section: Introduction
confidence: 99%
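The quoted passage credits the Armijo-Goldstein criterion and the BFGS method with improving the BP network's search efficiency. As a generic illustration of the standard BFGS inverse-Hessian update (a sketch, not the paper's implementation; a fixed step size stands in for a full line search):

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """One BFGS update of the inverse-Hessian approximation H, given the
    step s = x_{k+1} - x_k and gradient change y = g_{k+1} - g_k."""
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

# Minimize the quadratic f(x) = 0.5 * x.T A x with A = diag(1, 10).
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x = np.array([1.0, 1.0])
H = np.eye(2)                       # initial inverse-Hessian guess
for _ in range(20):
    g = grad(x)
    d = -H @ g                      # quasi-Newton search direction
    x_new = x + 0.1 * d
    s, y = x_new - x, grad(x_new) - g
    if y @ s > 1e-12:               # curvature condition guard
        H = bfgs_inverse_update(H, s, y)
    x = x_new
```

In the improved DDPG described above, such quasi-Newton steps would replace plain gradient descent when training the BP network's weights, with the Armijo-Goldstein criterion choosing the step length.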