2021
DOI: 10.48550/arxiv.2105.01244
Preprint

Regret-Optimal Full-Information Control

Abstract: We consider the infinite-horizon, discrete-time full-information control problem. Motivated by learning theory, as a criterion for controller design we focus on regret, defined as the difference between the LQR cost of a causal controller (that has access only to past and current disturbances) and the LQR cost of a clairvoyant one (that has access also to future disturbances). In the full-information setting, there is a unique optimal non-causal controller that in terms of LQR cost dominates all other controllers…
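As a rough sketch of the regret criterion described in the abstract (the notation J, K, and w below is ours, not taken verbatim from the paper): write J(K; w) for the infinite-horizon LQR cost incurred by a controller K on a disturbance sequence w = (w_0, w_1, ...), and let K_0 denote the clairvoyant non-causal controller that minimizes this cost for every w. The regret of a causal controller K is then the cost difference

\[
\mathrm{Regret}(K; w) \;=\; J(K; w) \;-\; J(K_0; w),
\]

and a regret-optimal design seeks the causal controller whose regret is smallest in a worst-case sense over disturbances (e.g., over all w of bounded energy); the exact worst-case formulation is not reproduced here.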

Cited by 8 publications (29 citation statements)
References 6 publications
“…Dynamic regret is a very similar metric to competitive ratio, which we consider in this paper, except that it is the difference between the cost of the online and offline controllers, rather than the ratio of the costs. The problem of designing controllers with optimal dynamic regret was studied in the finite-horizon, time-varying setting in [8], in the infinite-horizon LTI setting in [17], and in the measurement-feedback setting in [9]. Gradient-based algorithms with low dynamic regret against the class of disturbance-action policies were obtained in [12], [19].…”
Section: B. Related Work
confidence: 99%
“…A controller whose competitive ratio is bounded above by C offers the following guarantee: the cost it incurs is always at most a factor of C higher than the cost that could have been counterfactually incurred by any other controller, irrespective of how the disturbance is generated. Competitive ratio is a multiplicative analog of dynamic regret; the problem of obtaining controllers with optimal dynamic regret was recently considered in [8], [9], [17].…”
Section: Introduction
confidence: 99%
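The contrast drawn in the statement above can be summarized as follows (our paraphrase; ALG and OPT denote the online controller's cost and the offline clairvoyant cost on the same disturbance sequence w):

\[
\text{dynamic regret} \;=\; \sup_{w}\,\bigl(\mathrm{ALG}(w) - \mathrm{OPT}(w)\bigr),
\qquad
\text{competitive ratio} \;=\; \sup_{w}\,\frac{\mathrm{ALG}(w)}{\mathrm{OPT}(w)},
\]

so a competitive ratio bound of C says that ALG(w) ≤ C · OPT(w) for every disturbance sequence, which is the multiplicative guarantee described in the quotation.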
“…The problem of designing controllers with optimal dynamic regret was studied in the finite-horizon, time-varying setting in [5], in the infinite-horizon LTI setting in [12], and in the measurement-feedback setting in [6]. These works all bounded regret by the energy in the disturbances; the pathlength regret bounds we obtain in this paper also imply energy regret bounds which are optimal up to a factor of 4.…”
Section: Related Work
confidence: 82%
“…These works all bounded regret by the energy in the disturbances; the pathlength regret bounds we obtain in this paper also imply energy regret bounds which are optimal up to a factor of 4. Filtering algorithms with energy regret bounds were obtained in the finite-horizon setting in [5] and the infinite-horizon setting in [12]. Gradient-based control algorithms with low dynamic regret against the class of disturbance-action policies were obtained in [15]; the stronger metric of adaptive regret was studied in [8].…”
Section: Related Work
confidence: 99%
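For context on the two bound types mentioned in the last two statements (standard definitions, stated here only as an assumption since conventions vary slightly between papers): the energy of a disturbance sequence w_1, ..., w_T is its squared ℓ2 norm, while its pathlength measures how much the sequence varies from step to step,

\[
\text{energy}(w) \;=\; \sum_{t=1}^{T} \lVert w_t \rVert^2,
\qquad
\text{pathlength}(w) \;=\; \sum_{t=2}^{T} \lVert w_t - w_{t-1} \rVert^2,
\]

with some works defining pathlength using unsquared norms. An energy regret bound scales with the first quantity and a pathlength bound with the second, which is why pathlength bounds can be much tighter when the disturbances vary slowly.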