“…There are also other works that incorporate momentum to accelerate convergence; see, e.g., (Chen et al, 2022; Khanduri et al, 2021; Guo and Yang, 2021). After our initial conference submission, we also became aware of several concurrent works relevant to ours; e.g., (Dagreou et al, 2022; Grazzi et al, 2022; Li et al, 2022; Hu et al, 2022). Specifically, Dagreou et al (2022) proposed an SBO method with a variance-reduction technique and achieved the optimal rate.…”