2023
DOI: 10.48550/arxiv.2303.10260
Preprint

Online Linear Quadratic Tracking with Regret Guarantees

Abstract: Online learning algorithms for dynamical systems provide finite-time guarantees for control in the presence of sequentially revealed cost functions. We pose the classical linear quadratic tracking problem in the framework of online optimization, where the time-varying reference state is unknown a priori and is revealed only after the control input has been applied. We show the equivalence of this problem to the control of linear systems subject to adversarial disturbances and propose a novel online gradient descent-based algorithm…
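To make the interaction protocol concrete, below is a minimal Python sketch of the setting described in the abstract: the controller commits to an input before the time-varying reference is revealed, and a feedforward term is then adapted by online gradient descent on a one-step surrogate loss. This is an illustrative sketch only, not the paper's algorithm; the system matrices, the feedback gain K, the step size eta, and the surrogate loss are all assumptions.

import numpy as np

# Sketch of online LQ tracking: r_t is revealed only AFTER u_t is applied.
n, m, T = 2, 1, 200
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
K = np.array([[1.0, 1.5]])        # assumed stabilizing feedback gain
Q = np.eye(n)
R = 0.1 * np.eye(m)
eta = 0.05                        # OGD step size (assumption)

x = np.zeros(n)
theta = np.zeros(m)               # learned feedforward input
total_cost = 0.0
for t in range(T):
    u = -K @ x + theta            # commit to u_t before r_t is known
    r = np.array([np.sin(0.05 * t), 0.0])   # reference revealed afterwards
    total_cost += (x - r) @ Q @ (x - r) + u @ R @ u
    # One-step surrogate loss (an assumption for illustration):
    #   l_t(theta) = ||A x + B(-K x + theta) - r||_Q^2 + ||theta||_R^2
    x_pred = A @ x + B @ u
    grad = 2.0 * B.T @ Q @ (x_pred - r) + 2.0 * R @ theta
    theta = theta - eta * grad    # online gradient descent update
    x = A @ x + B @ u
print(f"average tracking cost: {total_cost / T:.3f}")

Because the reference enters the cost but not the dynamics, tracking an unknown reference behaves like rejecting an adversarial disturbance, which is the equivalence the abstract alludes to.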

Cited by 1 publication (3 citation statements)
References 9 publications
“…The upper bound in Theorem 1 depends linearly on $\sum_{t=1}^{T} \lVert \zeta_t - \zeta_{t-1} \rVert$, commonly termed the path length in the literature [11], [37], which can be interpreted as a measure of the variation of the cost functions. In [5], it was shown for the nominal setting (i.e., without disturbances) that an upper bound depending linearly on the path length is optimal.…”
Section: Theoretical Results (mentioning)
Confidence: 99%
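For reference, a path-length (dynamic) regret bound of the kind this statement describes typically takes the following form. This LaTeX sketch uses placeholder constants $C_0$, $C_1$ and is an assumption about the general shape of such a bound, not the paper's exact statement:

% Dynamic regret against time-varying comparators (x_t*, u_t*) induced by
% the parameters zeta_t; C_0 and C_1 are placeholder constants.
\[
  \mathrm{Regret}_T
    = \sum_{t=1}^{T} c_t(x_t, u_t) - \sum_{t=1}^{T} c_t(x_t^\star, u_t^\star)
    \le C_0 + C_1 \sum_{t=1}^{T} \lVert \zeta_t - \zeta_{t-1} \rVert .
\]

When the references (and hence the comparators) vary little, the path length is small and the bound tightens, which is why it serves as a measure of cost-function variation.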
“…Next, the predicted input sequence $\hat{u}_t$ is updated in step [S5] by a convex combination and will be used for prediction at the next time step $t+1$ (trajectory 3 in Figure 1). The convex combination ensures (i) constraint satisfaction, due to convexity of the constraint sets, and (ii) that the predicted input sequence is no longer updated once it reaches the estimated optimal steady state $\theta_t$, since then $\hat{z}^{\mu}_t - \zeta_t = 0$, which implies $\lambda_t = 0$ by (11). The latter avoids degrading closed-loop performance in case the virtual input sequence $g_t$ in (10) is chosen poorly, e.g., when $x_t = \theta_t$ but $g_t$ is chosen such that the state trajectory resulting from applying $g_t$ deviates from $\theta_t$ and only returns to the estimated optimal steady state $\theta_t$ at the end of the prediction horizon $\mu$.…”
Section: Algorithm (mentioning)
Confidence: 99%
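The convex-combination step quoted above can be sketched in a few lines of Python. The function name, the arguments, and in particular the rule for choosing the mixing weight lam are assumptions for illustration, not the cited paper's exact step [S5]; the only properties the sketch preserves are the two the statement names, namely feasibility under convex constraints and a vanishing update when the prediction already sits at the estimated optimum.

import numpy as np

def convex_combination_update(u_hat, g, z_mu, zeta, gain=1.0):
    """Convex combination of the previous prediction and a virtual sequence.

    u_hat : previous predicted input sequence, shape (mu, m)
    g     : virtual input sequence, shape (mu, m)
    z_mu  : predicted terminal state
    zeta  : estimated optimal steady state
    """
    err = np.linalg.norm(z_mu - zeta)
    lam = min(1.0, gain * err)        # lam = 0 exactly when z_mu == zeta
    # If u_hat and g are both feasible and the constraint set is convex,
    # the combination below is feasible as well.
    return (1.0 - lam) * u_hat + lam * g

u_hat = np.zeros((5, 1))              # previous prediction, horizon mu = 5
g = np.ones((5, 1))                   # hypothetical virtual input sequence
u_next = convex_combination_update(u_hat, g,
                                   z_mu=np.array([0.2]),
                                   zeta=np.array([0.0]))

Tying lam to the steady-state error means the prediction is left untouched once the estimated optimal steady state is reached, which is what protects closed-loop performance against a poorly chosen virtual sequence $g_t$.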