2022
DOI: 10.1007/s11071-022-07513-4

Event-triggered optimal containment control for multi-agent systems subject to state constraints via reinforcement learning

Cited by 13 publications (3 citation statements)
References 42 publications
“…The utility function acts as a weighting that reflects the optimal balance among control effort, tracking error, and potential disturbances. In [11], the control input and the external disturbance are treated as opposing players with conflicting objectives, and the optimal controller is obtained by solving the Hamilton-Jacobi-Isaacs (HJI) equation. Unlike the continuous-time case, the discrete-time Hamilton-Jacobi-Bellman (HJB) equation is …”
Section: Introduction (mentioning)
confidence: 99%
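As a point of reference for the HJI equation invoked in this statement, a standard continuous-time zero-sum (H-infinity-type) formulation is sketched below; the dynamics f, g, k, the state weight Q(x), input weight R, and attenuation level gamma are generic placeholders assumed for illustration, not the notation of [11]. For dynamics \dot{x} = f(x) + g(x)u + k(x)d and cost \int_0^\infty ( Q(x) + u^\top R u - \gamma^2 d^\top d )\,dt, the optimal value V^* satisfies

\[
0 = Q(x) + \nabla V^{*\top} f(x)
  - \tfrac{1}{4}\, \nabla V^{*\top} g(x) R^{-1} g(x)^{\top} \nabla V^{*}
  + \tfrac{1}{4\gamma^{2}}\, \nabla V^{*\top} k(x) k(x)^{\top} \nabla V^{*},
\]

with the saddle-point policies

\[
u^{*} = -\tfrac{1}{2} R^{-1} g(x)^{\top} \nabla V^{*}, \qquad
d^{*} = \tfrac{1}{2\gamma^{2}} k(x)^{\top} \nabla V^{*}.
\]

Substituting u^* and d^* back into the Hamiltonian yields the quadratic gradient terms above, which is why solving the HJI equation directly delivers both the optimal controller and the worst-case disturbance.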
“…Therefore, an improved control method for partially unknown systems was presented in terms of integral reinforcement learning (IRL) in Su et al. [18]. So far, several similar names exist for the ADP method, such as adaptive critic design [17], neuro-dynamic programming [19], reinforcement learning [20][21][22][23], and approximate dynamic programming [24].…”
Section: Introduction (mentioning)
confidence: 99%
“…In [6], experience replay was combined with ADP, using past and current data concurrently. In [7]–[9], an adaptive optimal controller was designed via online actor-critic learning to solve the robust optimal control problem for a class of nonlinear systems. In [10], a model-free λ-policy iteration (λ-PI) was presented for the discrete-time linear quadratic regulation (LQR) problem.…”
(mentioning)
confidence: 99%
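The λ-PI of [10] is model-free; as a point of reference for the LQR policy-iteration idea it builds on, below is a minimal sketch of the classical model-based policy iteration for discrete-time LQR (Hewer's algorithm), not the algorithm of [10] itself. The system matrices, weights, and initial gain are illustrative assumptions.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def lqr_policy_iteration(A, B, Q, R, K, iters=30):
    """Classical model-based policy iteration for discrete-time LQR.

    Starting from a stabilizing gain K, alternates policy evaluation
    (a discrete Lyapunov equation) with policy improvement. The
    model-free lambda-PI of [10] replaces the Lyapunov solve with
    data-driven estimates; this sketch shows only the baseline scheme.
    """
    for _ in range(iters):
        Acl = A - B @ K
        # Policy evaluation: P solves Acl^T P Acl - P + (Q + K^T R K) = 0
        P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
        # Policy improvement: K <- (R + B^T P B)^{-1} B^T P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Illustrative double-integrator-like example (hypothetical values):
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)
K0 = np.array([[1.0, 1.0]])   # must stabilize A - B @ K0
K, P = lqr_policy_iteration(A, B, Q, R, K0)

Each evaluation step solves a linear Lyapunov equation rather than the nonlinear Riccati equation, and under a stabilizing initial gain the iterates converge monotonically to the optimal LQR gain; the model-free variant in [10] recovers the same fixed point without access to A and B.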