2021
DOI: 10.1016/j.neucom.2020.11.014

A partial policy iteration ADP algorithm for nonlinear neuro-optimal control with discounted total reward

citations
Cited by 14 publications
(6 citation statements)
references
References 26 publications
“…Note that given a tuple of admissible control policies $u_j$ ($j \in N$), the value function $V_i^k$ and the updated policies $u_i^{k+1}$ can be solved, simultaneously, without requiring the system dynamics. (27), and the same updated control policies as (28).…”
Section: Off-policy IRL Algorithm (mentioning)
confidence: 99%
“…Reinforcement learning (RL) is a biologically inspired approximation method, which can learn the optimal policy through continuous interaction with the environment [19‐24]. Inspired by the idea of RL, adaptive dynamic programming (ADP) has been proposed and used to solve the optimization problem of uncertain systems with single input effectively [25‐28]. In terms of optimal control, considering unknown disturbances and completely unknown system dynamics, Reference 29 develops an integral reinforcement learning method based on the actor‐critic structure.…”
Section: Introduction (mentioning)
confidence: 99%
“…The adaptive dynamic programming method is suitable for solving the optimal control of complex nonlinear systems. But this method can only get an approximate optimal solution [2‐4]. The PID method is commonly used methods of ship motion control, and get good performance [5,6].…”
Section: Introduction (mentioning)
confidence: 99%
“…But this method can only get an approximate optimal solution [2‐4]. The PID method is commonly used methods of ship motion control, and get good performance [5,6]. But for the nonlinear of ship motion control, PID has poor robustness and it is easy to be saturation.…”
Section: Introduction (mentioning)
confidence: 99%