2022
DOI: 10.1002/oca.2916
Off‐policy integral reinforcement learning‐based optimal tracking control for a class of nonzero‐sum game systems with unknown dynamics

Abstract: This article studies the optimal tracking control problem of a class of multi‐input nonlinear systems with unknown dynamics based on reinforcement learning (RL) and nonzero‐sum game theory. First of all, an augmented system composed of the tracking error dynamics and the command generator dynamics is constructed. Then, the tracking coupled Hamilton–Jacobi (HJ) equations associated with a discounted cost function are derived, which give the Nash equilibrium solution. The existence of the Nash equilibrium is proved. To a…
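To make the abstract's construction concrete, here is a minimal sketch in generic notation (the symbols below are illustrative assumptions, not taken from the paper): stacking the tracking error $e$ and the command generator state $x_d$ into an augmented state $X = [e^\top, x_d^\top]^\top$, each of the $N$ players minimizes a discounted cost

\begin{equation}
  % generic N-player discounted cost; notation assumed for illustration
  J_i = \int_0^{\infty} e^{-\gamma t} \Big( X^\top Q_i X + \sum_{j=1}^{N} u_j^\top R_{ij}\, u_j \Big)\, dt,
  \qquad i = 1,\dots,N,
\end{equation}

and the Nash equilibrium is characterized by the simultaneous solution of $N$ coupled HJ equations of the form

\begin{equation}
  % coupled HJ equation for player i, with dynamics \dot{X} = F(X) + \sum_j G_j(X) u_j
  0 = X^\top Q_i X + \sum_{j=1}^{N} u_j^{*\top} R_{ij}\, u_j^{*} - \gamma V_i
      + (\nabla V_i)^\top \Big( F(X) + \sum_{j=1}^{N} G_j(X)\, u_j^{*} \Big),
  \qquad
  u_i^{*} = -\tfrac{1}{2} R_{ii}^{-1} G_i^\top(X)\, \nabla V_i .
\end{equation}

The discount factor $\gamma > 0$ keeps the cost finite even when the reference trajectory does not decay, which is the standard reason discounted costs appear in tracking formulations.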

Cited by 16 publications (3 citation statements)
References: 61 publications
"…Due to the widespread application of ADP, we can observe its extensions into areas such as NZSG, the event‐triggered mechanism (ETM), and trajectory tracking. Meanwhile, the iterative methods of ADP include value iteration (VI) 16–19 and policy iteration (PI). 20–22 Ha et al. 23 elaborated a new cost function to develop a VI‐based ADP framework to solve the tracking control problem for unknown systems. In Reference 24, a data‐driven iterative ADP was proposed to address the nonlinear optimal control problem.…"
Section: Introduction (mentioning)
Confidence: 99%
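For orientation, the VI/PI distinction referenced in this statement can be summarized in a generic discrete‐time ADP form (a standard textbook sketch, not the cited papers' exact algorithms): for dynamics $x_{k+1} = F(x_k, u_k)$ with stage cost $U(x_k, u_k)$,

\begin{align}
  % value iteration: iterate the value recursion directly
  \text{VI:}\quad & V^{(i+1)}(x_k) = \min_{u_k} \big\{ U(x_k, u_k) + V^{(i)}\big(F(x_k, u_k)\big) \big\}, \\
  % policy iteration: exact evaluation of the current policy, then improvement
  \text{PI:}\quad & \text{solve } V^{(i)}(x_k) = U\big(x_k, u^{(i)}(x_k)\big) + V^{(i)}\big(F(x_k, u^{(i)}(x_k))\big), \\
  & u^{(i+1)}(x_k) = \arg\min_{u_k} \big\{ U(x_k, u_k) + V^{(i)}\big(F(x_k, u_k)\big) \big\}.
\end{align}

VI iterates the value recursion directly and can start from $V^{(0)} = 0$, whereas PI evaluates each policy exactly and therefore needs an admissible initial policy; this trade‐off is the usual reason both families appear in the ADP literature.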
"…22 Thus, solving coupled Hamilton–Jacobi (HJ) equations is the key to dealing with NZS game systems. The RL method is applicable to continuous‐time NZS game systems 23 and discrete‐time NZS game systems. 24,25 The policy iteration (PI) of the ADP method is applied to yield the two‐player control laws.…"
Section: Introduction (mentioning)
Confidence: 99%
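As a sketch of how PI extends to the two‐player NZS setting this statement describes (generic notation, assumed for illustration rather than taken from the cited works): with dynamics $\dot{x} = f(x) + g_1(x)u + g_2(x)w$ and quadratic stage costs $r_i = x^\top Q_i x + u^\top R_{i1} u + w^\top R_{i2} w$, each iteration solves the two coupled evaluation equations

\begin{align}
  % coupled policy evaluation for the current policy pair (u^{(i)}, w^{(i)})
  0 &= r_1\big(x, u^{(i)}, w^{(i)}\big) + (\nabla V_1^{(i)})^\top \big( f + g_1 u^{(i)} + g_2 w^{(i)} \big), \\
  0 &= r_2\big(x, u^{(i)}, w^{(i)}\big) + (\nabla V_2^{(i)})^\top \big( f + g_1 u^{(i)} + g_2 w^{(i)} \big),
\end{align}

and then improves each player's policy against its own value function:

\begin{equation}
  u^{(i+1)} = -\tfrac{1}{2} R_{11}^{-1} g_1^\top(x)\, \nabla V_1^{(i)}, \qquad
  w^{(i+1)} = -\tfrac{1}{2} R_{22}^{-1} g_2^\top(x)\, \nabla V_2^{(i)} .
\end{equation}

Upon convergence the pair $(u, w)$ satisfies the two coupled HJ equations simultaneously, i.e., it constitutes the Nash equilibrium control laws.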