“…Machine learning has produced considerable recent interest in finite-time performance guarantees for system identification and control. In control, results have focused on finite-time regret bounds for the LQR problem with unknown dynamics (Dean et al., 2017, 2018; Mania et al., 2019; Dean et al., 2019; Cohen et al., 2019), with Simchowitz & Foster (2020) ultimately settling the minimax optimal regret in terms of dimension and time horizon; others have considered regret in online adversarial settings (Agarwal et al., 2019). Recent results in system identification have focused on obtaining finite-time, high-probability bounds on the estimation error of the system's parameters when observing the evolution over time (Tu et al., 2017; Faradonbeh et al., 2018; Hazan et al., 2018; Hardt et al., 2018; Simchowitz et al., 2018; Sarkar & Rakhlin, 2018; Oymak & Ozay, 2019; Simchowitz et al., 2019; Tsiamis & Pappas, 2019).…”