2015
DOI: 10.1088/1674-1056/24/9/090504

Off-policy integral reinforcement learning optimal tracking control for continuous-time chaotic systems

Abstract: This paper presents an off-policy integral reinforcement learning (IRL) algorithm to obtain the optimal tracking control of unknown chaotic systems. Off-policy IRL can learn the solution of the HJB equation from the system data generated by an arbitrary control. Moreover, off-policy IRL can be regarded as a direct learning method, which avoids the identification of system dynamics. In this paper, the performance index function is first given based on the system tracking error and control error. For solving th…

Citations: cited by 9 publications (4 citation statements)
References: 34 publications
“…Besides, Wei et al used an algorithm based on reinforcement learning to realize the optimal tracking control for unknown chaotic systems. [22,23] The advantage of the method is its simplicity and easy implementation.…”
Section: Introduction (mentioning)
confidence: 99%
“…Besides, Wei et al used an algorithm based on reinforcement learning to realize the optimal tracking control for unknown chaotic systems. [22,23] The advantage of the method is its simplicity and easy implementation.…”
Section: Introduction (mentioning)
confidence: 99%
“…The iterative value functions and control laws are obtained iteratively, and the iterative control laws must stabilize the system. [7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24] An initial stabilizing control law is required, however, it is often difficult to obtain. While in most applications, fewer iterations are required, and computationally demanding is more than that of the VI iteration algorithm.…”
Section: Introduction (mentioning)
confidence: 99%
“…As an online learning method, IRL makes use of the integral terms of the system cost during different time intervals, which can be addressed as a kind of reinforcement information, to estimate the cost functions, and thus, parts of the system dynamics can be unknown. Based on the IRL method, several works devoted to solving the continuous‐time optimal control problems. However, these works are all based on policy iteration schemes, in which cases the determination of the initial admissible policies is still difficult.…”
Section: Introduction (mentioning)
confidence: 99%