“…A fruitful line of research has focused on how to improve the asymptotic convergence rate as t → ∞ through preconditioning: a technique that involves approximating the unknown Hessian H(θ) = ∇²_θ L(θ) (see, for instance, Bordes et al (2009) and references therein). Utilizing the curvature information that is reflected by various efficient approximations of the Hessian matrix, stochastic quasi-Newton methods (Moritz et al, 2016; Byrd et al, 2016; Wang et al, 2017; Schraudolph et al, 2007; Mokhtari and Ribeiro, 2015; Becker and Fadili, 2012), Newton sketching or subsampled Newton methods (Pilanci and Wainwright, 2015; Xu et al, 2016; Berahas et al, 2017; Bollapragada et al, 2016) and stochastic approximation of the inverse Hessian via Taylor series expansion (Agarwal et al, 2017) have been proposed to strike a balance between convergence rate and per-iteration complexity.…”
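
To make the idea of preconditioning concrete, the following is a minimal sketch, not any of the cited methods. It assumes a toy noiseless two-dimensional least-squares problem whose Hessian is diag(1, 0.01), and it uses a deliberately crude preconditioner: the inverse of a diagonal Hessian estimate computed from a small subsample. All names (`SCALES`, `THETA_STAR`, `estimate_diag_hessian`, `sgd`) and all constants are hypothetical choices made for illustration.

```python
import random

# Assumed toy problem: L(theta) = E[0.5 * (x . theta - y)^2] with y = x . THETA_STAR.
# Feature standard deviations (1.0, 0.1) give the Hessian E[x x^T] = diag(1, 0.01).
SCALES = (1.0, 0.1)
THETA_STAR = (1.0, -2.0)


def estimate_diag_hessian(n=200, seed=1):
    """Estimate the Hessian diagonal E[x_j^2] from a subsample of n draws."""
    rng = random.Random(seed)
    sums = [0.0, 0.0]
    for _ in range(n):
        for j, s in enumerate(SCALES):
            sums[j] += rng.gauss(0.0, s) ** 2
    return [v / n for v in sums]


def sgd(precond, steps=2000, step_size=0.1, seed=0):
    """Run SGD with a fixed diagonal preconditioner; return the distance to THETA_STAR."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(steps):
        x = [rng.gauss(0.0, s) for s in SCALES]
        y = sum(xi * ti for xi, ti in zip(x, THETA_STAR))
        residual = sum(xi * ti for xi, ti in zip(x, theta)) - y
        # Stochastic gradient is residual * x; scale each coordinate by precond[j].
        for j in range(2):
            theta[j] -= step_size * precond[j] * residual * x[j]
    return sum((t - ts) ** 2 for t, ts in zip(theta, THETA_STAR)) ** 0.5


h = estimate_diag_hessian()
err_plain = sgd([1.0, 1.0])                   # identity preconditioner: plain SGD
err_precond = sgd([1.0 / hj for hj in h])     # inverse of the diagonal Hessian estimate
print("plain SGD error:         ", err_plain)
print("preconditioned SGD error:", err_precond)
```

In this sketch, plain SGD makes slow progress along the low-curvature coordinate, whereas rescaling each coordinate by the approximate inverse curvature equalizes progress across coordinates at the same step size. The diagonal estimate costs only O(d) per iteration, which illustrates in miniature the trade-off between convergence rate and per-iteration complexity; the cited quasi-Newton, sketching, and subsampling methods use richer approximations than this one.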