2019
DOI: 10.1002/rnc.4762

H∞ tracking control for linear discrete‐time systems via reinforcement learning

Abstract: In this paper, the H∞ tracking control of linear discrete‐time systems is studied via reinforcement learning. By defining an improved value function, the tracking game algebraic Riccati equation with a discount factor is obtained, which is solved by iterative learning algorithms. In particular, Q‐learning based on value iteration is presented for H∞ tracking control, which requires neither the system model information nor an initial allowable control policy. In addition, to improve the practicability …
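The abstract names the ingredients of the method without giving formulas. As a hedged sketch of the standard discounted H∞ tracking formulation (notation assumed here, not taken from the paper: augmented state z_k stacking the plant state and the reference, tracking error e_k, control u_k, disturbance w_k, discount factor λ ∈ (0, 1], attenuation level γ), the value function and its game Bellman equation typically read

```latex
V(z_k) = \sum_{i=k}^{\infty} \lambda^{i-k}\left( e_i^{\top} Q e_i + u_i^{\top} R u_i - \gamma^{2} w_i^{\top} w_i \right),
\qquad
V(z_k) = e_k^{\top} Q e_k + u_k^{\top} R u_k - \gamma^{2} w_k^{\top} w_k + \lambda V(z_{k+1}).
```

With a quadratic ansatz V(z) = z^T P z, substituting the linear augmented dynamics into the Bellman equation and taking the minimax over (u_k, w_k) yields a tracking game algebraic Riccati equation in P with discount factor λ, which is what the iterative learning algorithms mentioned in the abstract solve.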

Citation types: 0 supporting, 15 mentioning, 0 contrasting

Cited by 29 publications (15 citation statements)
References 31 publications
“…where Q, R, and S are all given positive definite matrices. Moreover, if u(t) is designed as (23), then the cost function defined in (22) will be bounded by…”
Section: Application to Unknown Linear Time-Delay Systems (mentioning; confidence: 99%)
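The quoted statement cuts off before the bound itself. For orientation only, with generic notation that is my assumption rather than the citing paper's equations (22) and (23), a cost in which Q, R, and S all enter as positive definite weights for a time-delay system often has the shape

```latex
J = \int_{0}^{\infty} \left( x(t)^{\top} Q x(t) + u(t)^{\top} R u(t) + x(t-\tau)^{\top} S x(t-\tau) \right) dt,
```

where τ > 0 is the state delay; boundedness of such a J under the designed control u(t) is the property the truncated sentence asserts.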
“…Since the algorithm in Reference 19 for solving the GARE contains two iteration loops and bears a heavy computation burden, a Newton-type, single-iteration-loop policy iteration (PI) based method for solving the GARE was proposed in Reference 8. The idea of Reference 8 was also extended to some other control problems, such as the two-player zero-sum game problem,20 nonlinear H∞ optimal control problems,21 tracking problems,22,23 and so on.24 Like other PI-based iteration methods, it needs a suitable initial state to start.…”
(mentioning; confidence: 99%)
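To make the single-loop PI idea concrete, here is a minimal model-based sketch in Python. Everything in it is illustrative: the toy matrices, the variable names, and the joint saddle-point update are my assumptions about a generic discounted game ARE, not the exact algorithm of Reference 8; a data-driven variant would replace the evaluation step with least-squares estimates from trajectories.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Toy system x_{k+1} = A x + B u + D w (illustrative values, not from the paper).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.1], [0.0]])
Q, R = np.eye(2), np.eye(1)
gamma, lam = 5.0, 0.95        # attenuation level, discount factor

K = np.zeros((1, 2))          # initial control gain,     u = -K x (must be "suitable")
L = np.zeros((1, 2))          # initial disturbance gain, w =  L x
for _ in range(50):
    Ac = A - B @ K + D @ L
    M = Q + K.T @ R @ K - gamma**2 * (L.T @ L)
    # Policy evaluation: P = M + lam * Ac' P Ac (a discounted Lyapunov equation).
    P = solve_discrete_lyapunov(np.sqrt(lam) * Ac.T, M)
    # Policy improvement: joint saddle-point update of (u, w), i.e. [u; w] = -F x.
    G = np.hstack([B, D])
    Lam = np.block([[R + lam * B.T @ P @ B, lam * B.T @ P @ D],
                    [lam * D.T @ P @ B,     lam * D.T @ P @ D - gamma**2 * np.eye(1)]])
    F = lam * np.linalg.solve(Lam, G.T @ P @ A)
    K_new, L_new = F[:1, :], -F[1:, :]
    if np.max(np.abs(K_new - K)) + np.max(np.abs(L_new - L)) < 1e-9:
        K, L = K_new, L_new
        break
    K, L = K_new, L_new
```

There is only one iteration loop: each pass solves one Lyapunov equation and updates both gains at once, which is the structural point the quoted passage credits to Reference 8.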
“…Bhattacharya et al.12 worked on a visibility‐based PE game when the environment contains a circular obstacle. Furthermore, Li et al.13 and Liu et al.14 developed a reinforcement learning (RL) algorithm to learn the Nash equilibrium solution for designing a model‐free controller by solving the game algebraic Riccati equation forward in time.…”
Section: Introduction (mentioning; confidence: 99%)
“…For example, the adaptive dynamic programming algorithm is an effective DDC method to deal with the time-varying trajectory tracking problem.33-36 However, the assumption y_d(k+1) = F y_d(k), where F is a constant matrix, is usually imposed on the desired time-varying trajectory to construct the augmented system. This assumption is not required for the indirect data-driven method37 and MFAC.…”
Section: Introduction (mentioning; confidence: 99%)
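The assumption quoted above is the standard reference-generator condition, and it pays off because it makes the tracking error linear in an augmented state. As a sketch in my own notation (plant x_{k+1} = A x_k + B u_k with output y_k = C x_k; none of these symbols are from the quoted paper), the augmented system is built as

```latex
z_k = \begin{bmatrix} x_k \\ y_d(k) \end{bmatrix}, \qquad
z_{k+1} = \begin{bmatrix} A & 0 \\ 0 & F \end{bmatrix} z_k
        + \begin{bmatrix} B \\ 0 \end{bmatrix} u_k, \qquad
e_k = \begin{bmatrix} C & -I \end{bmatrix} z_k,
```

so the tracking problem becomes a regulation problem on z_k. This construction is exactly what fails when y_d does not satisfy y_d(k+1) = F y_d(k), which is why the quoted passage contrasts it with MFAC and the indirect data-driven method.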