Adaptive optimal control for continuous-time linear systems based on policy iteration

Vrabie, Draguna; Păstrăvanu, Octavian; Abu-Khalaf, Murad; Lewis, Frank L.

doi:10.1016/j.automatica.2008.08.017

Cited by 762 publications

(452 citation statements)

References 21 publications

Supporting

Mentioning

449

Contrasting

Unclassified


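“…The policy iteration method proposed by Pastravanu, Abu-Khalaf, and Lewis [5], [8], [9] can be applied even when the internal model of the system and the derivatives of the state variables are unknown, and its stability and convergence have been proven from a control-theoretic viewpoint [10], [11]. A policy iteration method of this kind, applicable without complete knowledge of the system and with guaranteed stability and convergence, is classified as an adaptive optimal control technique from the control-theoretic viewpoint [5], [8].…”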

Section: Introduction. Policy iteration … optimal decision-making and optimal…

unclassified

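“…That is, learning requires sufficient exploration of the state space through probing noise, but this exploration hinders the convergence of the state variables, so a balance between the two is needed. However, [5], [8], which are applicable even when the internal model of the continuous-time system is unknown, …”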

Section: Introduction. Policy iteration … optimal decision-making and optimal…

unclassified

“…The policy iteration method proposed in [8] learns, for a given input […], the […] of the value function […] that satisfies (5), based on an integral equation derived from the Bellman equation below [5], [8], [9].…”

Section: LQ Optimal Control Theory and Mathematical Preliminaries

unclassified


Explorized Policy Iteration For Continuous-Time Linear Systems

Lee, Chun, Choi et al. 2012

The Transactions of The Korean Institute of Electrical Engineers


This paper addresses the problem that policy iteration (PI) for continuous-time (CT) systems requires exploration of the state space, known as persistency of excitation in the adaptive control community, and proposes a PI scheme explorized by an additional probing signal to solve it. The proposed PI method efficiently finds, in an online fashion, the related CT linear quadratic (LQ) optimal control without knowing the system matrix A, and guarantees stability and convergence to the LQ optimal control, which is proven in this paper in the presence of the probing signal. A design method for the probing signal is also presented to balance exploration of the state space against control performance. Finally, several simulation results are provided to verify the effectiveness of the proposed explorized PI method.


“…Definition 3: Consider system (27) and the signal generator (6). Assume σ(Ā(s)) ⊂ ℂ_{<0}, system (27) is minimal and suppose Assumptions 1, 3 and 4 hold.…”

Section: Linear Time-delay Systems

mentioning

confidence: 99%

“…In this paper, inspired by the learning algorithm given in [25] to solve a model-free adaptive dynamic programming problem (see also the references therein, e.g. [26], [27]), we propose an on-line algorithm for the model reduction of linear systems and linear time-delay systems from data. Collecting, at a given sequence of time instants t_k, time-snapshots (which resemble the ones used to compute a proper orthogonal decomposition (POD), see e.g.…”

Section: Introduction

mentioning

confidence: 99%

Model reduction for linear systems and linear time-delay systems from input/output data

Scarciotti, Astolfi 2015

2015 European Control Conference (ECC)


An algorithm for the estimation of the moments of linear single-input, single-output (SISO) systems and linear time-delay SISO systems from input/output data is proposed. It is proved that the estimate converges to the moments of the system. The estimate is exploited to construct a family of reduced order models. These models asymptotically match the moments of the unknown system to be reduced. Conditions to enforce additional properties, e.g. matching with prescribed eigenvalues or matching with prescribed zeros, upon the reduced order model are provided and discussed. The computational complexity of the algorithm is analyzed and the use of the algorithm is illustrated by a benchmark example.
