Haozhao Wang scite author profile

Distributed asynchronous offline training has received widespread attention in recent years because of its high performance on largescale data and complex models. As data are distributed from cloudcentric to edge nodes, a big challenge for distributed machine learning systems is how to handle native and natural non-independent and identically distributed (non-IID) data for training. Previous asynchronous training methods do not have a satisfying performance on non-IID data because it would result in that the training process fluctuates greatly which leads to an abnormal convergence. We propose a gradient scheduling algorithm with partly averaged gradients and global momentum (GSGM) for non-IID data distributed asynchronous training. Our key idea is to apply global momentum and local average to the biased gradient after scheduling, in order to make the training process steady. Experimental results show that for non-IID data training under the same experimental conditions, GSGM on popular optimization algorithms can achieve a 20% increase in training stability with a slight improvement in accuracy on Fashion-Mnist and CIFAR-10 datasets. Meanwhile, when expanding distributed scale on CIFAR-100 dataset that results in sparse data distribution, GSGM can perform a 37% improvement on training stability. Moreover, only GSGM can converge well when the number of computing nodes grows to 30, compared to the state-of-the-art distributed asynchronous algorithms. At the same time, GSGM is robust to different degrees of non-IID data.

show abstract

A Comprehensive Survey on Training Acceleration for Large Machine Learning Models in IoT

Wang

Zhou

et al. 2022

IEEE Internet Things J.

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Haozhao Wang

Stock Price Prediction Using Time Convolution Long Short-Term Memory Network

Partial Synchronization to Accelerate Federated Learning Over Relay-Assisted Edge Networks

Gradient Scheduling with Global Momentum for Non-IID Data Distributed Asynchronous Training

A Comprehensive Survey on Training Acceleration for Large Machine Learning Models in IoT

Contact Info

Product

Resources

About