2020
DOI: 10.1109/tcst.2018.2885689

Fitted Q-Function Control Methodology Based on Takagi–Sugeno Systems

Abstract: This paper presents a combined identification/Q-function fitting methodology, which involves identification of a Takagi–Sugeno model, computation of (sub)optimal controllers from Linear Matrix Inequalities (LMIs), and subsequent data-based fitting of the Q-function via monotonic optimisation. The LMI-based initialisation provides a conservative solution, but it is a sensible starting point to avoid convergence/local-minima issues in raw data-based fitted Q-iteration or Bellman residual minimisation. An inverted-pendulum…

Cited by 9 publications (6 citation statements)
References: 40 publications
“…This work shows a successful example of a model-free output feedback controller used to collect input-to-state transition samples from the process for learning state-feedback ADP-based ORM tracking control. Therefore it fits with the recent data-driven control [35][36][37][38][39][40][41][42][43] and reinforcement learning [12,44,45] applications.…”
Section: Introduction (supporting)
confidence: 56%
“…$J^{\infty}_{MR}$ from (5) obtained for γ = 1. The controller (36) will then close the feedback control loop as in…”
Section: Initial Controller With Model-free VRFT (mentioning)
confidence: 99%
“…However, such methods open up the possibility of getting caught on local minima if they are not properly initialised. In our other work (Díaz et al, 2020), we propose LMI-based solutions as a starting point for Bellman error approaches.…”
Section: Introduction (mentioning)
confidence: 99%
“…The fundamental problem of approximate dynamic programming is that optimality guarantees are lost when approximators V(x, θ) are used and, likewise, the classical iterative algorithms PI or VI might not converge (Fairbank and Alonso, 2012), owing to loss of contractivity (Busoniu et al, 2010); this might require modifying them, using, for example, gradient-descent iterations in what is called Bellman residual minimisation (Antos et al, 2008; Díaz et al, 2018). As alternatives to the iterative methodologies, by replacing equalities with inequalities in the Bellman equation, reformulations of the problem as a linear programming problem can obtain, without iterations, the optimal solution in the case of discrete state and control spaces, or an approximation to it (De Farias and Van Roy, 2003).…”
Section: Introduction (unclassified)