2020 59th IEEE Conference on Decision and Control (CDC) 2020
DOI: 10.1109/cdc42340.2020.9304362

Finite-time Identification of Stable Linear Systems: Optimality of the Least-Squares Estimator

Abstract: We study contextual bandits with low-rank structure where, in each round, if the (context, arm) pair (i, j) ∈ [m] × [n] is selected, the learner observes a noisy sample of the (i, j)-th entry of an unknown low-rank reward matrix. Successive contexts are generated randomly in an i.i.d. manner and are revealed to the learner. For such bandits, we present efficient algorithms for policy evaluation, best policy identification and regret minimization. For policy evaluation and best policy identification, we show th…

Cited by 22 publications (15 citation statements)
References 43 publications
“…Furthermore, when the nominal system is asymptotically stable, i.e., $a_{\max} < 1$, we have $\lambda_{\max}(\lambda_T) \lesssim m/(1 - a_{\max}^2)$ from Corollary 4.1. Combining this bound with (8) shows that the estimation error of each mode is $O(1/\sqrt{|T_i|})$, which matches the optimal decay rate of the LS estimator for asymptotically stable linear systems [4].…”
Section: A Arbitrary Switching (supporting)
confidence: 58%
“…Note that the derived bounds for the Gramian in the following sections are also applicable to other finite-sample bounds for linear system identification [2], [4], [5], when extended to switched systems with deterministic switching, as the Gramian is an essential object in these bounds.…”
Section: âI = Arg Min (mentioning)
confidence: 99%
“…By contrast, previous work assumes i.i.d. data or focuses on either linear models [Simchowitz et al., 2018, Tsiamis and Pappas, 2019, Jedra and Proutiere, 2020] or parametric models with known nonlinearities […, Sattar and Oymak, 2020, Jain et al., 2021].…”
Section: Discussion (mentioning)
confidence: 99%
“…Estimation of models (1) and (2) remains relatively poorly understood when the data is not i.i.d., with existing results being limited to when the function $f$ is known to belong to certain parametric classes. In terms of parameter recovery, the LSE converges at a rate of $T^{-1/2}$ for stable linear autoregressive systems $f(x_t) = A x_t$ [Simchowitz et al., 2018, Sarkar and Rakhlin, 2019, Jedra and Proutiere, 2020]. The same rate can also be achieved for linear systems with more general input-output behavior [… Ozay, 2019, Tsiamis and Pappas, 2019].…”
Section: Introduction (mentioning)
confidence: 95%
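
The $T^{-1/2}$ rate quoted in the statement above can be checked numerically. The following is a minimal sketch, not taken from the cited papers: it assumes a hypothetical 2×2 stable matrix `A_true`, standard Gaussian noise, and a zero initial state, and it uses NumPy's ordinary least-squares solver as the estimator. The function names are illustrative only.

```python
import numpy as np

def simulate(A, T, rng, noise_std=1.0):
    """Generate a trajectory x_{t+1} = A x_t + w_t, starting from x_0 = 0."""
    d = A.shape[0]
    X = np.zeros((T + 1, d))
    for t in range(T):
        X[t + 1] = A @ X[t] + noise_std * rng.standard_normal(d)
    return X

def least_squares_estimate(X):
    """Least-squares estimate of A from a single trajectory X of shape (T+1, d)."""
    past, future = X[:-1], X[1:]
    # future ≈ past @ A.T, so the least-squares solution B estimates A.T
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    return B.T

def mean_error(A, T, trials=20, seed=0):
    """Average operator-norm error of the estimate over independent trajectories."""
    rng = np.random.default_rng(seed)
    errs = [
        np.linalg.norm(least_squares_estimate(simulate(A, T, rng)) - A, 2)
        for _ in range(trials)
    ]
    return float(np.mean(errs))

# Hypothetical stable system (spectral radius 0.5), chosen for illustration only
A_true = np.array([[0.5, 0.2],
                   [0.0, 0.3]])

# Increasing T by a factor of 16 should shrink the error by roughly a factor of 4
errors = {T: mean_error(A_true, T) for T in (100, 1600)}
for T, err in errors.items():
    print(f"T = {T:5d}  mean error = {err:.4f}")
```

If the error scales as $T^{-1/2}$, the ratio `errors[100] / errors[1600]` should be close to 4, up to sampling noise from the finite number of trials.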