In recent years, there have been great advances in the field of decentralized learning with private data. Federated learning (FL) and split learning (SL) are two spearheads, each with its own pros and cons: FL is suited for a large number of user clients, whereas SL is suited for large models. To enjoy both benefits, hybrid approaches such as SplitFed have recently emerged, yet their fundamentals have remained elusive. In this work, we first identify the fundamental bottlenecks of SL and thereby propose a scalable SL framework, coined SGLR. The server under SGLR broadcasts a common gradient averaged at the split layer, emulating FL without any additional communication across clients, as opposed to SplitFed. Meanwhile, SGLR splits the learning rate into server-side and client-side rates and adjusts them separately to support many clients in parallel. Simulation results corroborate that SGLR achieves higher accuracy than other baseline SL methods, including SplitFed, and is even on par with FL, which consumes more energy and communication resources. As a secondary result, we observe that SGLR leaks less sensitive information, measured via mutual information, than the baselines.
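To make the split-layer mechanism concrete, below is a minimal PyTorch-style sketch of one SGLR round under toy assumptions: linear client models, a small server-side model, and random data. All names, dimensions, and learning-rate values here are illustrative assumptions, not the paper's actual implementation.

import torch

# Hypothetical illustration of SGLR: each client runs its lower layers,
# the server runs the shared upper layers, and the gradient at the cut
# layer is averaged over clients before being broadcast back.
torch.manual_seed(0)
num_clients, batch, feat_in, feat_cut, num_classes = 3, 8, 32, 16, 10

client_models = [torch.nn.Linear(feat_in, feat_cut) for _ in range(num_clients)]
server_model = torch.nn.Sequential(torch.nn.ReLU(),
                                   torch.nn.Linear(feat_cut, num_classes))

# SGLR decouples the two learning rates (values are arbitrary here).
lr_client, lr_server = 0.05, 0.01
server_opt = torch.optim.SGD(server_model.parameters(), lr=lr_server)
client_opts = [torch.optim.SGD(m.parameters(), lr=lr_client) for m in client_models]
loss_fn = torch.nn.CrossEntropyLoss()

# Client-side forward passes: "smashed data" sent up to the server.
smashed = []
for m in client_models:
    x = torch.randn(batch, feat_in)        # stand-in for private local data
    smashed.append(m(x))

# Server-side forward/backward per client; server grads accumulate (sum).
server_opt.zero_grad()
cut_grads = []
for s in smashed:
    s_in = s.detach().requires_grad_(True)  # server-side entry point
    loss = loss_fn(server_model(s_in), torch.randint(num_classes, (batch,)))
    loss.backward()
    cut_grads.append(s_in.grad)
server_opt.step()

# Server averages the split-layer gradients and broadcasts the SAME
# gradient to every client, emulating FL without client-to-client links.
avg_grad = torch.stack(cut_grads).mean(dim=0)
for s, opt in zip(smashed, client_opts):
    opt.zero_grad()
    s.backward(avg_grad)                    # client-side backprop
    opt.step()

Because every client receives the same averaged split-layer gradient, a single server broadcast replaces the per-client gradient unicast of vanilla SL, with no communication across clients.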
Introduction

The recent trend in deep learning has seen exponential growth in architecture sizes (Alom et al. 2018). In the computer vision domain, model sizes have grown steadily over the years, as observed in the transition from ResNet and VGG (He et al. 2015; Simonyan and Zisserman 2015) to Inception and DenseNet (Szegedy et al. 2015; Huang et al. 2018). In the field of natural language processing, the growth is even more drastic: starting from BERT, RoBERTa, and XLM (Devlin et al. 2019; Liu et al. 2019; Goyal et al. 2021), which crossed the 100-million-parameter mark, and reaching OpenAI's recent GPT-3 (Brown et al. 2020), standing at a staggering 175 billion parameters. The prerequisites for running these large models are huge training data and computing power, making them still limited to a select few. Fortunately for such large models, both data volume and computing power keep increasing exponentially. Nonetheless, the majority of these data and computing resources are sourced from mobile edge devices.