2019 IEEE Bombay Section Signature Conference (IBSSC)
DOI: 10.1109/ibssc47189.2019.8973068
Performance Analysis of Deep Q Networks and Advantage Actor Critic Algorithms in Designing Reinforcement Learning-based Self-tuning PID Controllers

Cited by 5 publications (2 citation statements)
References 9 publications
“…These methods seek the optimal policy directly, without first computing a value function. Using gradient-based algorithms such as Proximal Policy Optimization (PPO) [45] and Advantage Actor-Critic (A2C) [46], they update the policy parameters based on their measured performance. Equation (3) depicts the gradient-ascent rule for updating the policy parameters, one of the most popular methods.…”
Section: ) Policy-based Methodsmentioning
confidence: 99%
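The gradient-ascent update referenced as equation (3) is not reproduced on this page, but the standard rule θ ← θ + α · A · ∇θ log π(a|s) can be sketched as follows. This is a minimal illustration with a linear softmax policy; the function and variable names are hypothetical, not from the cited paper:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a vector of logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def policy_gradient_step(theta, state, action, advantage, lr=0.1):
    """One gradient-ascent update: theta <- theta + lr * advantage * grad log pi(a|s).

    theta: (n_actions, n_features) weights of a linear softmax policy.
    """
    probs = softmax(theta @ state)
    # For a linear softmax policy, grad log pi(a|s) = (one_hot(a) - probs) outer state.
    one_hot = np.zeros_like(probs)
    one_hot[action] = 1.0
    grad_log_pi = np.outer(one_hot - probs, state)
    return theta + lr * advantage * grad_log_pi

# Toy usage: a positive advantage should raise the chosen action's probability.
theta = np.zeros((2, 3))
state = np.array([1.0, 0.5, -0.2])
p_before = softmax(theta @ state)[0]
theta = policy_gradient_step(theta, state, action=0, advantage=2.0)
p_after = softmax(theta @ state)[0]
```

Scaling the log-probability gradient by an advantage estimate (rather than the raw return) is what distinguishes A2C-style updates from plain REINFORCE.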
“…However, these methods require more computational power and memory because they use two separate networks. They also suffer from instability, delayed rewards, and extended training times due to correlation problems between the actor and the critic [46]. The most popular DRL agent types supported by Matlab [50] are listed in Table 2.…”
Section: A Selecting Agent Typementioning
confidence: 99%
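The "two separate networks" mentioned above are the actor (policy) and the critic (value estimator), each with its own parameters and its own update. A minimal linear sketch of one advantage actor-critic step, using a TD error as the advantage estimate (all names hypothetical, not from the cited paper):

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a vector of logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def a2c_step(actor_w, critic_w, s, a, r, s_next, gamma=0.99,
             lr_actor=0.05, lr_critic=0.1):
    """One advantage actor-critic update with two separate linear models.

    actor_w:  (n_actions, n_features) softmax policy weights.
    critic_w: (n_features,) linear state-value weights.
    """
    # Critic: TD error doubles as the advantage estimate.
    v, v_next = critic_w @ s, critic_w @ s_next
    advantage = r + gamma * v_next - v
    critic_w = critic_w + lr_critic * advantage * s          # TD(0) update
    # Actor: policy gradient scaled by the critic's advantage.
    probs = softmax(actor_w @ s)
    one_hot = np.zeros_like(probs)
    one_hot[a] = 1.0
    actor_w = actor_w + lr_actor * advantage * np.outer(one_hot - probs, s)
    return actor_w, critic_w, advantage

# Toy usage: a positive reward from zero-initialised models yields a positive
# advantage, pulling the critic's value estimate for s upward.
s = np.array([1.0, 0.5, -0.2])
actor_w, critic_w, adv = a2c_step(np.zeros((2, 3)), np.zeros(3),
                                  s, a=0, r=1.0, s_next=s)
```

Keeping `actor_w` and `critic_w` as distinct parameter sets is what drives the extra memory and compute cost the statement refers to, and the fact that each update depends on the other model's current estimate is the source of the actor-critic correlation problem.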