Understanding the behavior of stochastic gradient descent (SGD) in the context of deep neural networks has attracted considerable attention recently. Along this line, we study a general form of gradient-based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics. By investigating this general dynamics, we analyze how SGD escapes from minima and its regularization effects. A novel indicator is derived to characterize the efficiency of escaping from minima by measuring the alignment between the noise covariance and the curvature of the loss function. Based on this indicator, two conditions are established under which a noise structure is superior to isotropic noise in terms of escaping efficiency. We further show that the anisotropic noise in SGD satisfies both conditions, and thus helps SGD escape efficiently from sharp and poor minima towards more stable and flat minima that typically generalize well. We systematically design experiments to verify the benefits of this anisotropic noise, compared with full gradient descent plus isotropic diffusion (i.e., Langevin dynamics).
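To make the unified dynamics concrete, the following is a minimal Python sketch, a hypothetical illustration rather than the paper's exact formulation: the same noisy update rule covers both SGD-style anisotropic noise and Langevin-style isotropic noise, and a simple trace-based proxy (an assumption on our part, not the paper's derived indicator) captures the alignment between noise covariance and curvature.

```python
import numpy as np

# Hypothetical sketch of the general noisy gradient dynamics described above:
#     theta_{t+1} = theta_t - eta * grad(theta_t) + eta * eps_t,
# where eps_t ~ N(0, Sigma). Taking Sigma as the minibatch gradient covariance
# idealizes SGD noise; taking Sigma = sigma^2 * I recovers Langevin dynamics.

rng = np.random.default_rng(0)

def noisy_gd_step(theta, grad_fn, eta, cov):
    """One step of gradient descent with unbiased Gaussian noise of covariance cov."""
    noise = rng.multivariate_normal(np.zeros(len(theta)), cov)
    return theta - eta * grad_fn(theta) + eta * noise

def alignment_indicator(hessian, cov):
    """Proxy (an assumption, not the paper's formula) for escaping efficiency:
    Tr(H @ Sigma) normalized by Tr(Sigma), so that noise concentrated along
    sharp (high-curvature) directions scores higher."""
    return np.trace(hessian @ cov) / np.trace(cov)

# Toy quadratic loss L(theta) = 0.5 * theta^T H theta with a sharp and a flat direction.
H = np.diag([10.0, 0.1])
grad = lambda th: H @ th

iso_cov = np.eye(2)                             # isotropic (Langevin-like) noise
aniso_cov = np.diag([10.0, 0.1]) * 2.0 / 10.1   # curvature-aligned noise, same trace

print(alignment_indicator(H, iso_cov))    # ~5.05: lower alignment
print(alignment_indicator(H, aniso_cov))  # ~9.90: higher alignment -> faster escape

# Example usage of the dynamics itself:
theta = np.array([1.0, 1.0])
for _ in range(100):
    theta = noisy_gd_step(theta, grad, eta=0.01, cov=0.01 * aniso_cov)
```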
Compared with standard supervised learning, the key difficulty in semi-supervised learning is how to make full use of the unlabeled data. A recently proposed method, virtual adversarial training (VAT), performs adversarial training without label information to impose local smoothness on the classifier, which is especially beneficial for semi-supervised learning. In this work, we propose tangent-normal adversarial regularization (TNAR), an extension of VAT that takes the data manifold into consideration. TNAR is composed of two complementary parts: tangent adversarial regularization (TAR) and normal adversarial regularization (NAR). In TAR, VAT is applied along the tangent space of the data manifold, enforcing local invariance of the classifier on the manifold, while in NAR, VAT is performed on the normal space orthogonal to the tangent space, imposing robustness on the classifier against noise that causes the observed data to deviate from the underlying data manifold. Experiments on both artificial and real-world datasets demonstrate that TAR and NAR complement each other and jointly outperform other state-of-the-art methods for semi-supervised learning.
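As a concrete illustration of the tangent/normal decomposition underlying TAR and NAR, here is a minimal Python sketch. The tangent basis U is assumed to be available (e.g., from the Jacobian of a generative model's decoder; a random orthonormal stand-in is used below), and the function names and shapes are hypothetical rather than the paper's implementation.

```python
import numpy as np

def tangent_normal_split(r, U):
    """Decompose a perturbation r into tangent and normal components.

    r : (d,)   candidate adversarial perturbation at a data point x
    U : (d, k) orthonormal basis of the manifold tangent space at x (k << d)
    """
    r_tan = U @ (U.T @ r)   # projection onto the tangent space (used by TAR)
    r_nor = r - r_tan       # residual in the normal space (used by NAR)
    return r_tan, r_nor

rng = np.random.default_rng(0)
d, k = 8, 2
# Random orthonormal tangent basis via QR, standing in for a decoder Jacobian.
U, _ = np.linalg.qr(rng.standard_normal((d, k)))
r = rng.standard_normal(d)

r_tan, r_nor = tangent_normal_split(r, U)
assert np.allclose(r_tan + r_nor, r)                    # exact decomposition
assert np.allclose(U.T @ r_nor, 0.0, atol=1e-10)        # normal part is orthogonal to the tangent space
```

VAT-style regularization would then search for the worst-case perturbation within each subspace separately, which is what distinguishes TAR (on-manifold invariance) from NAR (off-manifold robustness).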