2012
DOI: 10.1016/j.automatica.2012.06.096

Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics

Citations: cited by 830 publications (446 citation statements)
References: 25 publications
“…This last condition, called excitability of the pair (S, ω(0)), is a geometric characterization of the property that the signals generated by (3) are persistently exciting, see [38]. The excitability of the pair (S, ω(0)) can be also connected to the notion of exploration noise used in the data-driven dynamic programming problem, see [39]-[41]. However, note that while it is not clear how to select the exploration noise in optimal control problems, the excitability condition provides a precise characterization of the frequencies that have to be present in the excitation signal.…”
Section: A Model Reduction By Moment Matching - Recalled (mentioning)
confidence: 99%
“…In this case p can be taken larger than nν and, as well-known from linear algebra and remarked in [25] and [34], the solution of equation (11) is the least squares solution of (10).…”
Section: A Preliminary Analysis (mentioning)
confidence: 99%
“…However, in practice a model of the system to be reduced is not always available. In this paper, inspired by the learning algorithm given in [25] to solve a model-free adaptive dynamic programming problem (see also the references therein, e.g. [26], [27]), we propose an on-line algorithm for the model reduction of linear systems and linear time-delay systems from data.…”
Section: Introduction (mentioning)
confidence: 99%
“…In [19], the algorithm of policy improvement with path integrals is integrated with reinforcement learning to achieve variable impedance control. In [20], impedance adaptation for robot control is developed based on adaptive dynamic programming proposed in [21]. Literature reviews of reinforcement learning can be found in [22,23], which introduce the use of reinforcement learning in feedback control and state open challenges of developing a reinforcement learning control.…”
Section: Introduction (mentioning)
confidence: 99%