2016
DOI: 10.1002/oca.2259

Online identifier–actor–critic algorithm for optimal control of nonlinear systems

Abstract: In this paper, a novel identifier–actor–critic optimal control scheme is developed for discrete‐time affine nonlinear systems with uncertainties. In contrast to traditional adaptive dynamic programming methodology, which requires at least partial knowledge of the system dynamics, a neural‐network identifier is employed to learn the unknown control coefficient matrix, working together with an actor–critic‐based scheme to solve the optimal control problem online. The critic network learns the approximate value funct…
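For orientation, the problem class named in the abstract can be sketched with the standard discrete-time affine formulation. This is a textbook setup, not taken from the paper itself: the symbols f, g, Q, R and the quadratic control penalty are assumptions made here for illustration, and the paper's actual cost function and notation may differ.

```latex
% Assumed setup (illustrative): affine dynamics with drift f and
% control coefficient matrix g, the quantity the identifier is said to learn.
\begin{align}
  x_{k+1} &= f(x_k) + g(x_k)\,u_k \\
  % Assumed infinite-horizon cost: Q(x) >= 0, R symmetric positive definite.
  V(x_k) &= \sum_{j=k}^{\infty} \left[ Q(x_j) + u_j^{\top} R\, u_j \right] \\
  % Bellman optimality equation for the optimal value function V^*.
  V^{*}(x_k) &= \min_{u_k} \left\{ Q(x_k) + u_k^{\top} R\, u_k + V^{*}(x_{k+1}) \right\} \\
  % Stationarity in u_k: 2 R u_k + g(x_k)^T dV^*/dx_{k+1} = 0.
  u_k^{*} &= -\tfrac{1}{2}\, R^{-1} g(x_k)^{\top}
             \frac{\partial V^{*}(x_{k+1})}{\partial x_{k+1}}
\end{align}
```

Under these assumptions the optimal control depends explicitly on g(x_k), which suggests why a scheme without model knowledge would need an identifier for the control coefficient matrix alongside the critic (approximating V*) and the actor (approximating u*).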

Citations: cited by 19 publications (6 citation statements)
References: 44 publications
“…1,2 Actor neural network is another model among the ADP schemes that has been implemented aside the critic to provide a direct approximation of the control policy. 3,4 Moreover in many studies (e.g., References 5 and 6), some non-quadratic functions of the control inputs have been embedded into the performance index to provide constrained optimal laws. However, despite some efforts (e.g., Reference 7), a little progress has been made to consider output/state constraints in the design procedure of the ADP-based controllers.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this method, model uncertainty is usually overcome using neuro‐adaptive identifiers and the states are estimated using neural observers 1,2 . Actor neural network is another model among the ADP schemes that has been implemented aside the critic to provide a direct approximation of the control policy 3,4 . Moreover in many studies (e.g., References 5 and 6), some non‐quadratic functions of the control inputs have been embedded into the performance index to provide constrained optimal laws.…”
Section: Introduction (mentioning)
confidence: 99%
“…For CT tracking problem, Kamalapurkar, 35 transforming the tracking problem that has a time-varying value function into a time-invariant optimal control problem, uses policy iteration to approximate the optimal controller. There are also available ADP techniques 36–43 to find solutions of optimal tracking problems with partially unknown, completely unknown or uncertain system dynamics. Based on ADP algorithms, there has already existed many related industrial applications like: coal gasification, 44 residential energy systems, 45 microgrid, 46 air conditioning systems, 47 shield tunneling machine 48 and mismatched interconnected nonlinear systems 49 in recent years.…”
Section: Introduction (mentioning)
confidence: 99%
“…This theory for linear systems has been highly improved; however, the nonlinear optimal control problem (OCP) has become a strong topic and should be deeper investigated. The solution schemes for solving nonlinear OCPs are generally classified as direct and indirect methods. The direct methods generally use a discretization or parameterization scheme to transform the OCP into a nonlinear programming one.…”
Section: Introduction (mentioning)
confidence: 99%