2012
DOI: 10.1049/iet-cta.2010.0521
Approximate dynamic programming for continuous-time linear quadratic regulator problems: relaxation of known input-coupling matrix assumption

Cited by 13 publications (4 citation statements)
References 24 publications
“…This process corresponds to the idea of off-policy in the RL literature, which refers to the fact that the executed policies differ from the estimated ones. Such a mechanism is extended to non-linear systems in [28, 29]. Another intelligent technique from the field of RL, experience replay (ER), is successfully applied by Modares et al [30] in their adaptive algorithm.…”
Section: Introduction
Mentioning confidence: 99%
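The off-policy plus experience-replay combination cited above can be sketched minimally. The `ReplayBuffer` class below is a generic illustration, not code from [30]: it stores transitions generated by whatever behaviour policy was executed, so a different (estimated) policy can later be updated from the replayed samples.

```python
import random
from collections import deque


class ReplayBuffer:
    """Minimal experience-replay buffer: stores (s, a, r, s') transitions
    and samples past experience uniformly, so learning can reuse data
    generated under earlier (different) behaviour policies."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest entries evicted first

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling decorrelates consecutive transitions.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


# Usage: record transitions while following any behaviour policy,
# then replay a batch of them to update the estimated policy.
buf = ReplayBuffer(capacity=100)
for t in range(5):
    buf.push(t, 0, 1.0, t + 1)
batch = buf.sample(3)
```

Because sampling ignores which policy produced the data, any update driven by these batches is off-policy by construction.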
“…However, for some complicated nonlinear systems, choosing the internal parameters remains an intractable problem. One available approach is given in the work of Lee et al (2012), which is discussed only for linear systems.…”
Section: The Critic Network Design and Stability Analysis
Mentioning confidence: 99%
“…The outer iteration aims to achieve the optimal Q-function. However, the iterative control sequence cannot be obtained directly by solving (29) and (31); the inner iteration is therefore required.…”
Section: Algorithm Derivation
Mentioning confidence: 99%
“…Adaptive dynamic programming (ADP), proposed by Werbos [21, 22], has demonstrated powerful self-learning capability for the optimisation of complex non-linear systems [23–31]. Q-learning is a typical ADP method, proposed by Watkins [32, 33].…”
Section: Introduction
Mentioning confidence: 99%
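Watkins' Q-learning, mentioned in the excerpt above, can be illustrated with a minimal tabular sketch. The two-state chain below is a made-up toy example, not from the cited works; the key point is that the update target uses the greedy value max over Q(s', ·) regardless of which policy generated the data, which is what makes the method off-policy.

```python
def q_learning(transitions, n_states, n_actions, alpha=0.1, gamma=0.9):
    """Watkins' tabular Q-learning update applied to a batch of
    recorded (s, a, r, s') transitions."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for s, a, r, s_next in transitions:
        # Greedy bootstrap target, independent of the behaviour policy.
        td_target = r + gamma * max(Q[s_next])
        Q[s][a] += alpha * (td_target - Q[s][a])
    return Q


# Toy 2-state chain: in state 0, action 1 yields reward 1 and moves to
# state 1; all other transitions yield reward 0.
data = [(0, 1, 1.0, 1), (0, 0, 0.0, 0), (1, 0, 0.0, 1)] * 50
Q = q_learning(data, n_states=2, n_actions=2)
```

After repeated sweeps over this data, Q[0][1] approaches the rewarding action's value while Q[0][0] stays strictly below it, so the greedy policy picks action 1 in state 0.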