2023
DOI: 10.1002/rnc.7021

Non‐zero‐sum games of discrete‐time Markov jump systems with unknown dynamics: An off‐policy reinforcement learning method

Xuewen Zhang,
Hao Shen,
Feng Li
et al.

Abstract: This article addresses the non‐zero‐sum game problem of discrete‐time Markov jump systems without requiring system dynamics information. First, the multiplayer non‐zero‐sum game problem is converted into solving a set of coupled game algebraic Riccati equations, which are difficult to solve directly. Then, to obtain the optimal control policies, a model‐based algorithm adapting the policy iteration approach is proposed. However, the model‐based algorithm relies on system dynamics information, wh…

Cited by 2 publications (1 citation statement)
References 48 publications
“…Then, the relevant NZSG problems under adaptive dynamic programming (ADP) were studied [3][4][5]. Liu et al. [6] came up with a unique online technique for multi-player NZSGs of nonlinear partially unknown continuous-time systems with control constraints developed based on NNs. Wang et al. [7] designed a specific cost function related to tracking errors and their derivatives to solve NZSG problem of continuous-time plants via ADP.…”
Section: Introduction
mentioning
confidence: 99%