2019
DOI: 10.1109/access.2019.2917141
Actor-Critic Reinforcement Learning Control of Non-Strict Feedback Nonaffine Dynamic Systems

Abstract: Most existing work on actor-critic reinforcement learning control (ARLC) deals with continuous affine systems or discrete nonaffine systems. In this paper, I propose a new ARLC method for continuous nonaffine dynamic systems subject to unknown dynamics and external disturbances. A new input-to-state stable system is developed to establish an augmented dynamic system, from which I further obtain a strict-feedback affine model that is convenient for control design based on a model transformat…
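The abstract is cut off before the design details, but as a rough point of reference for what an actor-critic learning controller looks like, below is a minimal NumPy sketch for a scalar non-affine plant. The plant dynamics, radial-basis features, cost, and learning rates are all hypothetical placeholders and are not the construction proposed in the paper.

import numpy as np

rng = np.random.default_rng(0)
dt = 0.02

def plant(x, u):
    # Hypothetical unknown non-affine dynamics: the control u enters
    # through a nonlinearity, so the model is not affine in u.
    return x + dt * (np.sin(x) + u + 0.3 * np.tanh(u))

def features(e):
    # Radial-basis features of the tracking error, shared by actor and critic.
    centers = np.linspace(-2.0, 2.0, 9)
    return np.exp(-((e - centers) ** 2) / 0.5)

w_critic = np.zeros(9)                    # value (cost-to-go) weights
w_actor = np.zeros(9)                     # control-law weights
alpha_c, alpha_a, gamma, sigma = 0.05, 0.01, 0.95, 0.2

x = 0.0
for k in range(5000):
    ref = np.sin(0.01 * k)                # reference trajectory to track
    e = x - ref
    phi = features(e)

    u_mean = w_actor @ phi                # actor: nominal control
    u = u_mean + sigma * rng.standard_normal()   # exploration noise

    x_next = plant(x, u)
    e_next = x_next - np.sin(0.01 * (k + 1))
    cost = e ** 2 + 0.01 * u ** 2         # one-step tracking cost

    # Critic: semi-gradient TD(0) update of the approximate cost-to-go.
    td_error = cost + gamma * (w_critic @ features(e_next)) - w_critic @ phi
    w_critic += alpha_c * td_error * phi

    # Actor: Gaussian policy-gradient step; a positive TD error means the
    # explored control did worse than expected, so move the mean away from it.
    w_actor -= alpha_a * td_error * (u - u_mean) / sigma ** 2 * phi

    x = x_next

print("final tracking error:", abs(x - np.sin(0.01 * 5000)))

In the paper itself, the design is instead built on an augmented input-to-state stable system and the strict-feedback affine model obtained from it; this toy loop does not attempt to reproduce that construction.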

Cited by 16 publications (8 citation statements)
References 34 publications
“…The code is available at https://github.com/openai/maddpg. MADDPG is an extension of the actor-critic [29], [32] model. However, MADDPG has to train an independent policy network for each agent, where each agent learns a policy specialized to specific tasks [33] based on its own observation, and the policy network easily overfits to the number of agents.…”
Section: Multi-Agent Deep Deterministic Policy Gradient Methods
confidence: 99%
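For readers unfamiliar with the structure the quoted statement criticizes (one independent, decentralized policy per agent plus centralized critics), a hedged PyTorch sketch follows; the layer sizes and dimensions are invented placeholders, and this is not the code from the linked openai/maddpg repository.

import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized policy: sees only its own agent's observation."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )

    def forward(self, obs):
        return self.net(obs)

class CentralizedCritic(nn.Module):
    """Centralized critic: sees all agents' observations and actions."""
    def __init__(self, total_obs_dim, total_act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(total_obs_dim + total_act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, all_obs, all_acts):
        return self.net(torch.cat([all_obs, all_acts], dim=-1))

# One independent actor (and critic) per agent, sized for a fixed agent count.
n_agents, obs_dim, act_dim = 3, 8, 2
actors = [Actor(obs_dim, act_dim) for _ in range(n_agents)]
critics = [CentralizedCritic(n_agents * obs_dim, n_agents * act_dim)
           for _ in range(n_agents)]

obs = torch.randn(n_agents, obs_dim)                  # toy observations
acts = torch.stack([a(o) for a, o in zip(actors, obs)])
q_vals = [c(obs.flatten(), acts.flatten()) for c in critics]
print([q.item() for q in q_vals])

Because every actor and critic here is sized for a fixed number of agents, adding or removing agents requires rebuilding and retraining the networks, which is the over-specialization concern raised in the quote.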
“…(iii) Actor-critic methods aim to combine the advantages of actor-only and critic-only methods. For actor-critic [28], [29] algorithms, it was generally believed that learning value functions with large, non-linear function approximators was difficult and unstable.…”
Section: Hierarchical Communication Mechanism
confidence: 99%
“…Clearly, there exists a compromise between multiple indices, including prescribed performance and other control qualities. In practical applications, we should make a reasonable tradeoff between these indices [117, 157-169].…”
Section: The Approach for Handling the Perturbations of Tracking Errors
confidence: 99%
“…Shear Impedance Mode Control with Adaptive Fuzzy Compensation for Robot-Environment Interaction was investigated by Hu [14]. Actor-Critic Reinforcement Learning Control of Non-Strict Feedback Nonaffine Dynamic Systems was investigated by Bu [15]. Sadeq used an optimal control strategy to maximize electric vehicle hybrid energy storage system performance considering topographical information [16].…”
Section: Extended Fuzzy Adaptive Event-Trigger Compensation
confidence: 99%