“…Moreover, our analysis assumes perfect tuning of constants (e.g., D, T, K) for simplicity. In practice, we would prefer to adapt to unknown parameters, which motivates new applications and problems for adaptive online learning, already an area of active investigation (see, e.g., Orabona & Pál, 2015; Hoeven et al., 2018; Cutkosky & Orabona, 2018; Cutkosky, 2019; Mhammedi & Koolen, 2020; Chen et al., 2021; Sachs et al., 2022; Wang et al., 2022). It is our hope that some of this expertise can be applied in the non-convex setting as well.…”