“…As a projection-free algorithm, the Frank-Wolfe (FW) method [Frank and Wolfe, 1956] has been studied for both convex optimization [Jaggi, 2013, Lacoste-Julien and Jaggi, 2015, Garber and Hazan, 2015, Hazan and Luo, 2016, Mokhtari et al, 2018b] and non-convex optimization problems [Lacoste-Julien, 2016, Reddi et al, 2016, Mokhtari et al, 2018c, Shen et al, 2019b]. In large-scale settings, distributed FW methods were proposed to solve specific problems, including optimization under a block-separable constraint set [Wang et al, 2016] and learning low-rank matrices [Zheng et al, 2018]. Communication-efficient distributed FW variants were proposed for specific sparse learning problems in Bellet et al [2015] and Lafond et al [2016], and for general constrained optimization problems in .…”