2020
DOI: 10.48550/arxiv.2003.00295
Preprint

Adaptive Federated Optimization

Abstract: Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. Due to the heterogeneity of the client datasets, standard federated optimization methods such as Federated Averaging (FEDAVG) are often difficult to tune and exhibit unfavorable convergence behavior. In non-federated settings, adaptive optimization methods have had notable success in combating such issues. In this work, we pr…

Cited by 157 publications (351 citation statements)
References 17 publications
“…This section presents the experimental results of the proposed method, SPIDER, in comparison to local adaptation, perFedAvg, and Ditto on the CIFAR100 dataset, another dataset explored by researchers for federated learning [23]. All our experiments are based on a non-IID data distribution among FL clients.…”
Section: Experiments on CIFAR100 Dataset (mentioning)
Confidence: 99%
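
The excerpt above mentions a non-IID distribution of CIFAR100 across FL clients without giving the partitioning scheme. The sketch below shows one common way to create such a split (label-distribution skew via a Dirichlet prior); the function name and the concentration parameter alpha are hypothetical and not taken from the cited work.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.3, seed=0):
    """Split sample indices across clients with label-distribution skew (sketch).

    labels: 1-D array of integer class labels (e.g., 100 classes for CIFAR100).
    alpha:  Dirichlet concentration; smaller values give more skewed,
            i.e., more non-IID, per-client label distributions (illustrative default).
    """
    rng = np.random.default_rng(seed)
    num_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(num_clients)]

    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        # Sample a per-class allocation over clients from Dir(alpha).
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn proportions into split points within this class's samples.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())

    return [np.array(ix) for ix in client_indices]
```
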
“…To address the data-heterogeneity challenge, variants of the standard FedAvg have been proposed to train a global model, including FedProx [17], FedOPT [23], and FedNova [31]. In addition to training a global model, frameworks that focus on training personalized models have also gained a lot of popularity.…”
Section: Introduction (mentioning)
Confidence: 99%
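
FedOPT [23] refers to the adaptive server-side optimization framework of the paper indexed on this page. Below is a minimal sketch of its FedAdam-style server update, which treats the averaged client delta as a pseudo-gradient; the flat NumPy parameter vector and the default hyperparameter values are illustrative assumptions, not values prescribed by the paper.

```python
import numpy as np

def fedadam_server_update(x, client_deltas, m, v,
                          lr=1e-2, beta1=0.9, beta2=0.99, tau=1e-3):
    """One server round of a FedAdam-style update (sketch).

    x:             current global model parameters (flat array).
    client_deltas: list of per-client updates, delta_i = x_i_local - x.
    m, v:          server-side first/second moment estimates (same shape as x).
    Returns the new global parameters and the updated moments.
    """
    # Average the client updates into a single pseudo-gradient.
    delta = np.mean(client_deltas, axis=0)
    # Adam-style moment updates on the pseudo-gradient.
    m = beta1 * m + (1.0 - beta1) * delta
    v = beta2 * v + (1.0 - beta2) * np.square(delta)
    # Server step; tau controls the degree of adaptivity.
    x_new = x + lr * m / (np.sqrt(v) + tau)
    return x_new, m, v
```
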
“…For Assumption 3, besides the widely applied bounded local gradient variance in FL, we use the global bound σ_g to quantify the data heterogeneity due to the non-i.i.d. distributed training dataset, which is also introduced in recent FL studies [19], [36]. Additionally, to illustrate the device heterogeneity under the system-heterogeneous FL formulated in this paper, we make an extra assumption on the bound of the approximated gradients from the proposed FedLGA algorithm, as follows.…”
Section: Convergence Analysis (mentioning)
Confidence: 99%
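
For readers unfamiliar with the notation, bounded-variance and bounded-heterogeneity assumptions of the kind referenced above are commonly stated as below. This is a generic paraphrase under assumed notation ($F_i$, $f$, $g_i$, $\sigma_l$, $\sigma_g$), not the exact statement from the cited FedLGA analysis.

\[
\mathbb{E}\left\| g_i(x) - \nabla F_i(x) \right\|^2 \le \sigma_l^2,
\qquad
\frac{1}{N}\sum_{i=1}^{N} \left\| \nabla F_i(x) - \nabla f(x) \right\|^2 \le \sigma_g^2,
\]
where $F_i$ is client $i$'s local objective, $f = \tfrac{1}{N}\sum_{i=1}^{N} F_i$ is the global objective, and $g_i(x)$ is a stochastic gradient of $F_i$ at $x$. The first bound limits local sampling noise; the second, with constant $\sigma_g$, quantifies how far the clients' non-i.i.d. objectives can drift from the global one.
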
“…In FedAvg, FL clients run multiple epochs of SGD before sending their locally computed gradients to the server, which updates the global model accordingly. Beyond FedAvg, other optimization mechanisms have been proposed to improve convergence and efficiency [45].…”
Section: Federated Learning (mentioning)
Confidence: 99%
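
A minimal sketch of one FedAvg round as summarized above: each client runs several local epochs of SGD from the current global parameters, and the server averages the results, weighting each client by its data size. The flat parameter vector, the least-squares local objective, and the helper names are illustrative assumptions.

```python
import numpy as np

def local_sgd(x_global, data, targets, epochs=5, lr=0.1):
    """Client-side training: a few epochs of SGD on a simple least-squares
    objective, starting from the current global parameters (sketch)."""
    x = x_global.copy()
    for _ in range(epochs):
        for a, y in zip(data, targets):
            grad = (a @ x - y) * a          # gradient of 0.5 * (a^T x - y)^2
            x -= lr * grad
    return x

def fedavg_round(x_global, client_datasets):
    """Server-side step: collect locally trained models and average them,
    weighting each client by its number of samples."""
    updates, weights = [], []
    for data, targets in client_datasets:
        updates.append(local_sgd(x_global, data, targets))
        weights.append(len(targets))
    weights = np.array(weights, dtype=float)
    weights /= weights.sum()
    return sum(w * u for w, u in zip(weights, updates))
```
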