We study stochastic zeroth-order gradient and Hessian estimators for real-valued functions in $\mathbb{R}^n$. We show that, by taking finite differences along random orthogonal directions, the variance of the stochastic finite-difference estimators can be significantly reduced. In particular, we design estimators for smooth functions such that, if one uses $ \varTheta \left ( k \right ) $ random directions sampled from the Stiefel manifold $ \text{St} (n,k) $ and finite-difference granularity $\delta $, the variance of the gradient estimator is bounded by $ \mathscr{O} \left ( \left ( \frac{n}{k} - 1 \right ) + \left ( \frac{n^2}{k} - n \right ) \delta ^2 + \frac{ n^2 \delta ^4} { k } \right ) $, and the variance of the Hessian estimator is bounded by $\mathscr{O} \left ( \left ( \frac{n^2}{k^2} - 1 \right ) + \left ( \frac{n^4}{k^2} - n^2 \right ) \delta ^2 + \frac{n^4 \delta ^4 }{k^2} \right ) $. When $k = n$, the variances become negligibly small. In addition, we provide improved bias bounds for the estimators. The bias of both the gradient and Hessian estimators for a smooth function $f$ is of order $\mathscr{O} \big( \delta ^2 \varGamma \big )$, where $\delta $ is the finite-difference granularity and $ \varGamma $ depends on high-order derivatives of $f$. Our results are corroborated by empirical observations.
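A minimal sketch of a zeroth-order gradient estimator of this kind, assuming one plausible form of the construction (central differences along $k$ orthonormal directions, rescaled by $n/k$ so the estimator is unbiased for linear functions; the exact estimator in the paper may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def stiefel_sample(n, k, rng):
    """Sample k orthonormal directions in R^n, i.e. a point on St(n, k),
    via QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    return Q  # columns are orthonormal

def zo_gradient(f, x, delta, k, rng):
    """Central-difference gradient estimator along k random orthonormal
    directions. Since E[V V^T] = (k/n) I for uniform V on St(n, k),
    the n/k factor makes the estimator unbiased for linear f."""
    n = x.size
    V = stiefel_sample(n, k, rng)
    coeffs = np.array([(f(x + delta * v) - f(x - delta * v)) / (2 * delta)
                       for v in V.T])
    return (n / k) * V @ coeffs

# Quadratic test function: central differences are exact for quadratics,
# and with k = n the directions span R^n, so the estimate is exact,
# matching the claim that the variance vanishes when k = n.
A = np.diag(np.arange(1.0, 6.0))
f = lambda x: 0.5 * x @ A @ x
x = np.ones(5)
g = zo_gradient(f, x, delta=1e-3, k=5, rng=rng)
```

With $k < n$ the same call returns an unbiased but noisy estimate, with variance decreasing as $k$ grows, consistent with the $\mathscr{O}(n/k - 1)$ leading term above.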
Nesterov's accelerated gradient method (NAG) is widely used in machine learning problems, including deep learning, and corresponds to a continuous-time differential equation. Through this connection, the properties of the differential equation and of its numerical approximations can be investigated to improve the accelerated gradient method. In this work we present a new improvement of NAG, inspired by numerical analysis, in terms of stability. We give the precise order of NAG as a numerical approximation of its continuous-time limit and then present a new method of higher order. We show theoretically that our new method is more stable than NAG for large step sizes. Experiments on matrix completion and handwritten digit recognition demonstrate that our new method is indeed more stable, and the improved stability leads to higher computational speed in practice.
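For reference, a minimal sketch of the baseline scheme the abstract starts from: standard NAG with the $k/(k+3)$ momentum schedule, whose continuous-time limit is the ODE $\ddot{x} + \frac{3}{t}\dot{x} + \nabla f(x) = 0$ identified by Su, Boyd, and Candès. The quadratic objective below is an illustrative assumption; the paper's higher-order variant is not reproduced here.

```python
import numpy as np

# Illustrative smooth convex objective f(x) = 0.5 * x^T A x,
# with Lipschitz gradient constant L = 10.
A = np.diag([1.0, 10.0])
def f(x): return 0.5 * x @ A @ x
def grad(x): return A @ x

def nag(x0, step, iters):
    """Standard NAG: a gradient step from the extrapolated point y,
    then momentum extrapolation with coefficient k / (k + 3)."""
    x = x0.copy()
    y = x0.copy()
    for k in range(iters):
        x_next = y - step * grad(y)          # gradient step at y
        y = x_next + (k / (k + 3)) * (x_next - x)  # extrapolation
        x = x_next
    return x

# Step size below 1/L = 0.1 keeps the iteration stable;
# the paper's point is precisely that stability degrades for
# large step sizes and can be improved by a higher-order scheme.
x_final = nag(np.array([5.0, 5.0]), step=0.09, iters=500)
```

The convergence guarantee $f(x_k) - f^\star \le 2\|x_0 - x^\star\|^2 / (s(k+1)^2)$ for this schedule makes the final objective value small here; pushing `step` well above $1/L$ is where NAG loses stability.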