H∞ Tracking Control for Linear Discrete-Time Systems: Model-Free Q-Learning Designs

In this article, an output-feedback Q-learning algorithm is proposed for discrete-time linear systems to deal with the $H_\infty$ tracking control problem. The problem is formulated as a zero-sum game in the Stackelberg game framework, with a discount factor that keeps the value function bounded. According to the principle of optimality, the game algebraic Riccati equation (GARE) is derived and solved by the Q-learning algorithm to obtain the optimal solution of the Stackelberg game without requiring knowledge of the system dynamics or the state. It is proved that, with excitation noises during training, the solution of the algorithm converges to the optimal control input and the worst-case disturbance, and that the Stackelberg strategy can achieve a lower $L_2$ disturbance attenuation level than the Nash one. Moreover, the impacts of the discount factor on the stability of the closed-loop system and on the solvability of the GARE are analyzed to provide criteria for choosing the discount factor. Simulation examples are provided to validate the effectiveness of the algorithm.
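As a point of reference for the GARE and for where the discount factor enters it, the sketch below runs a model-based value iteration on a discounted zero-sum game Riccati equation. It is not the article's output-feedback Q-learning or Stackelberg design: it assumes the system matrices are known, uses the standard simultaneous saddle-point form, and regulates the state instead of tracking a reference. The plant `A`, `B`, `E`, the weights `Q`, `R`, the attenuation level `theta`, and the discount `gamma` are hypothetical illustration values.

```python
import numpy as np

def game_riccati_step(P, A, B, E, Q, R, theta, gamma):
    """Apply the discounted game Riccati operator once to P.

    Assumed model: x+ = A x + B u + E w, stage cost x'Qx + u'Ru - theta^2 w'w.
    Returns the updated P and gains K, L with u = -K x, w = -L x.
    """
    m, q = B.shape[1], E.shape[1]
    G = np.hstack([B, E])                       # inputs stacked as z = [u; w]
    W = np.block([[R, np.zeros((m, q))],
                  [np.zeros((q, m)), -theta**2 * np.eye(q)]])
    H_xz = gamma * A.T @ P @ G                  # state/input cross block
    H_zz = W + gamma * G.T @ P @ G              # input/input block
    gains = np.linalg.solve(H_zz, H_xz.T)       # rows are [K; L]
    P_next = Q + gamma * A.T @ P @ A - H_xz @ gains
    return P_next, gains[:m], gains[m:]

def solve_discounted_gare(A, B, E, Q, R, theta, gamma, iters=2000, tol=1e-12):
    """Value iteration from P = 0 until the update is below tol."""
    P = np.zeros_like(Q, dtype=float)
    for _ in range(iters):
        P_next, _, _ = game_riccati_step(P, A, B, E, Q, R, theta, gamma)
        done = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if done:
            break
    # Recompute the gains from the final P so that P, K, L are consistent.
    _, K, L = game_riccati_step(P, A, B, E, Q, R, theta, gamma)
    return P, K, L

# Hypothetical example data (not taken from the article).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.array([[0.1], [0.0]])
Q, R = np.eye(2), np.eye(1)
theta, gamma = 5.0, 0.9

P, K, L = solve_discounted_gare(A, B, E, Q, R, theta, gamma)
```

Here `P` is the kernel of the quadratic value function, `K` is the control gain, and `L` is the worst-case disturbance gain. Re-running with other values of `gamma` or `theta` is one way to probe the solvability question raised in the abstract for a given plant.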

Section: Introduction

confidence: 86%

“…Therefore, this article mainly focuses on using reinforcement learning to solve the $H_\infty$ tracking problem. There have been many studies developed for $H_\infty$ control 30-34 and $H_\infty$ tracking problems 35-39. For the $H_\infty$ control by using reinforcement learning, a fundamental work is in Reference 33, where a model-free Q-learning is designed for discrete-time zero-sum games.…”

Section: Introduction

confidence: 99%

“…$H_\infty$ … $H_\infty$ tracking controller for partially unknown linear continuous-time systems in Reference 39.…”

Output-feedback Q-learning for discrete-time linear H∞ tracking control: A Stackelberg game approach

Ren, Wang, Duan 2022

Robust H∞ tracking of linear discrete-time systems using Q-learning

Valadbeigi, Shu, Sedigh 2023

This paper deals with a robust $H_\infty$ tracking problem with a discount factor. A new auxiliary system is established in terms of norm-bounded time-varying uncertainties. It is shown that solving the robust discounted $H_\infty$ tracking problem for the auxiliary system solves the original problem. Then, the new robust discounted $H_\infty$ tracking problem is represented as a well-known zero-sum game problem. Moreover, the robust tracking Bellman equation and the robust tracking algebraic Riccati equation (RTARE) are derived. A lower bound on the discount factor is obtained to ensure the stability of the closed-loop system. Based on the auxiliary system, the system is reshaped into a new structure that is applicable to reinforcement learning methods. Finally, an online Q-learning algorithm that requires no knowledge of the system matrices is proposed to solve the algebraic Riccati equation associated with the robust discounted $H_\infty$ tracking problem for the auxiliary system. Simulation results are given to verify the effectiveness and merits of the proposed method.
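To illustrate what solving such a Riccati equation by Q-learning without the system matrices can look like, here is a minimal batch sketch under simplifying assumptions. It is a generic state-feedback Q-function value iteration for a discounted zero-sum game, fitted by least squares. It is not the paper's robust auxiliary-system algorithm, it regulates rather than tracks, and it learns from a pre-collected batch rather than online. The plant `A`, `B`, `E` is a hypothetical black box used only to generate samples, and all weights and constants are illustration values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical plant: used ONLY to simulate data, never read by the learner.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.array([[0.1], [0.0]])
n, m, q = 2, 1, 1
p = n + m + q
Qx, R, theta, gamma = np.eye(n), np.eye(m), 5.0, 0.9   # assumed weights

# Exploratory batch: random states, controls and disturbances.
N = 400
x = rng.normal(size=(N, n))
u = rng.normal(size=(N, m))
w = rng.normal(size=(N, q))
x_next = x @ A.T + u @ B.T + w @ E.T                    # black-box transition
stage = (np.einsum('ki,ij,kj->k', x, Qx, x)
         + np.einsum('ki,ij,kj->k', u, R, u)
         - theta**2 * np.sum(w**2, axis=1))             # stage cost samples

# Quadratic features of z = [x; u; w]: all products z_i z_j with i <= j.
z = np.hstack([x, u, w])
iu = np.triu_indices(p)
Phi = z[:, iu[0]] * z[:, iu[1]]

P = np.zeros((n, n))                                    # value kernel, start at 0
for _ in range(300):
    # Bellman target built from data and the current value estimate.
    target = stage + gamma * np.einsum('ki,ij,kj->k', x_next, P, x_next)
    coef, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    U = np.zeros((p, p))
    U[iu] = coef
    H = 0.5 * (U + U.T)                                 # symmetric Q-function kernel
    gains = np.linalg.solve(H[n:, n:], H[n:, :n])       # rows are [K; L]
    P = H[:n, :n] - H[:n, n:] @ gains                   # value at the saddle point

K, L = gains[:m], gains[m:]                             # u = -K x, w = -L x
```

The loop touches only `x`, `u`, `w`, `x_next`, and the stage costs, which is the sense in which the sketch is model-free. With noise-free data the Bellman target is exactly quadratic in `z`, so the least-squares fit recovers the kernel `H` up to rounding error.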

Finite-horizon H∞ tracking control for discrete-time linear systems

Wang, Liang et al. 2023

In this paper, model-free finite-horizon tracking control for discrete-time linear systems is studied. By formulating an augmented system consisting of the considered linear system and the command generator system, an augmented time-varying Riccati equation whose solutions achieve finite-horizon tracking control is derived. Then, a time-varying Q-function that contains the control input and the disturbance input is designed. A time-varying Q-function-based method is developed to learn the solutions of the finite-horizon tracking control problem without knowledge of the system model and without any model identification. It is proved that the solutions of the time-varying Q-function-based method converge to the optimal solutions of the augmented time-varying Riccati equation. Finally, simulation examples are provided to show the feasibility and advantages of the time-varying Q-function-based method compared with the infinite-horizon Q-learning-based tracking control method.
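The augmented system and the time-varying Riccati equation described above can be sketched as follows. This is a model-based backward recursion, shown only to make the structure concrete; it is not the paper's model-free time-varying Q-function method. The plant, the output matrix `C`, the command generator `F`, the terminal weight (taken equal to the stage weight), and all numeric values are assumptions for illustration.

```python
import numpy as np

def backward_game_riccati(T, B1, E1, Q1, R, theta, N):
    """Backward recursion for a finite-horizon zero-sum game.

    Assumed model: X+ = T X + B1 u + E1 w, stage cost X'Q1 X + u'Ru - theta^2 w'w,
    terminal weight P_N = Q1. Returns lists Ps (P_0..P_N), Ks and Ls (k = 0..N-1),
    with u_k = -Ks[k] @ X_k and w_k = -Ls[k] @ X_k.
    """
    m, q = B1.shape[1], E1.shape[1]
    G = np.hstack([B1, E1])                     # inputs stacked as [u; w]
    W = np.block([[R, np.zeros((m, q))],
                  [np.zeros((q, m)), -theta**2 * np.eye(q)]])
    P = Q1.copy()
    Ps, Ks, Ls = [P], [], []
    for _ in range(N):                          # runs k = N-1, ..., 0
        H_xz = T.T @ P @ G
        H_zz = W + G.T @ P @ G
        gains = np.linalg.solve(H_zz, H_xz.T)   # rows are [K_k; L_k]
        P = Q1 + T.T @ P @ T - H_xz @ gains
        P = 0.5 * (P + P.T)                     # remove rounding asymmetry
        Ps.append(P)
        Ks.append(gains[:m])
        Ls.append(gains[m:])
    return Ps[::-1], Ks[::-1], Ls[::-1]

# Hypothetical plant and command generator (constant reference r+ = F r).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.array([[0.1], [0.0]])
C = np.array([[1.0, 0.0]])
F = np.array([[1.0]])
n, p = A.shape[0], F.shape[0]

# Augmented state X = [x; r]; the tracking error is y - r = C1 X.
T = np.block([[A, np.zeros((n, p))], [np.zeros((p, n)), F]])
B1 = np.vstack([B, np.zeros((p, B.shape[1]))])
E1 = np.vstack([E, np.zeros((p, E.shape[1]))])
C1 = np.hstack([C, -np.eye(p)])
Q1 = 10.0 * C1.T @ C1                           # assumed error weight
R, theta, N = np.eye(1), 5.0, 20

Ps, Ks, Ls = backward_game_riccati(T, B1, E1, Q1, R, theta, N)
```

Because the gains `Ks[k]` depend on the time index `k`, the resulting controller is time-varying, in contrast to the single stationary gain of the infinite-horizon designs.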