Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and other constraints that are not primary considerations in other problem settings. This paper provides recommendations and guidelines on formulating, designing, evaluating and analyzing federated optimization algorithms through concrete examples and practical implementation, with a focus on conducting effective simulations to infer real-world performance. The goal of this work is not to survey the current literature, but to inspire researchers and practitioners to design federated learning algorithms that can be used in various practical applications.
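To make the problem formulation concrete, below is a minimal sketch of the canonical federated optimization objective, min_x F(x) = (1/M) Σ_i F_i(x), solved with a FedAvg-style simulation. The synthetic quadratic client objectives and names such as `client_grad` and `fedavg` are illustrative assumptions, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, heterogeneous client objectives: F_i(x) = 0.5 * ||A_i x - b_i||^2.
# The global objective is F(x) = (1/M) * sum_i F_i(x).
M, d = 10, 5
A = [rng.normal(size=(20, d)) for _ in range(M)]
b = [rng.normal(size=20) for _ in range(M)]

def client_grad(i, x):
    """Full-batch gradient of client i's local objective."""
    return A[i].T @ (A[i] @ x - b[i])

def fedavg(rounds=50, local_steps=5, lr=0.01, clients_per_round=5):
    x = np.zeros(d)  # global model held by the server
    for _ in range(rounds):
        chosen = rng.choice(M, size=clients_per_round, replace=False)
        updates = []
        for i in chosen:
            xi = x.copy()
            for _ in range(local_steps):   # local SGD steps on client i
                xi -= lr * client_grad(i, xi)
            updates.append(xi - x)         # client sends only its model delta
        x += np.mean(updates, axis=0)      # server averages the deltas
    return x

x_final = fedavg()
```

Even this toy setup exhibits the paper's central tension: more local steps reduce communication rounds but, under heterogeneous client data, pull the averaged model away from the true global optimum.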
We analyze (stochastic) gradient descent (SGD) with delayed updates on smooth quasi-convex and non-convex functions and derive concise, non-asymptotic convergence rates. We show that the rate of convergence in all cases consists of two terms: (i) a stochastic term which is not affected by the delay, and (ii) a higher-order deterministic term which is only linearly slowed down by the delay. Thus, in the presence of noise, the effects of the delay become negligible after a few iterations and the algorithm converges at the same optimal rate as standard SGD. This result extends a line of research that showed similar results only in the asymptotic regime or for strongly convex quadratic functions. We further show similar results for SGD with more intricate forms of delayed gradients: compressed gradients under error compensation, and local SGD, where multiple workers perform local steps before communicating with each other. In all of these settings we improve upon the best known rates. These results show that SGD is robust to compressed and/or delayed stochastic gradient updates. This is particularly important for distributed parallel implementations, where asynchronous and communication-efficient methods are key to achieving linear speedups for optimization with multiple devices.
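The error-compensation mechanism mentioned above can be stated in a few lines. The sketch below pairs SGD with a biased top-k sparsifier and an error memory that re-injects whatever the compressor dropped; the objective, the compressor choice, and names like `ec_sgd` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy smooth objective: f(x) = 0.5 * ||A x - b||^2, with stochastic row sampling.
A = rng.normal(size=(200, 10))
b = rng.normal(size=200)

def stochastic_grad(x, batch=8):
    idx = rng.choice(len(b), size=batch, replace=False)
    return A[idx].T @ (A[idx] @ x - b[idx]) / batch

def top_k(v, k=2):
    """Biased top-k sparsifier: keep only the k largest-magnitude entries."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-k:]
    out[keep] = v[keep]
    return out

def ec_sgd(steps=500, lr=0.05):
    x = np.zeros(A.shape[1])
    e = np.zeros_like(x)           # error-compensation memory
    for _ in range(steps):
        g = stochastic_grad(x)
        p = lr * g + e             # add back previously dropped mass
        delta = top_k(p)           # only the compressed part is "transmitted"
        e = p - delta              # remember what the compressor dropped
        x -= delta
    return x

x_final = ec_sgd()
```

The error memory `e` is what makes the compression act like a delay rather than a bias: every coordinate is eventually applied, just later, which matches the abstract's claim that the delay only affects the higher-order deterministic term.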
In this paper, we study the challenging task of Byzantine-robust decentralized training on arbitrary communication graphs. Unlike federated learning, where workers communicate through a server, workers in the decentralized setting can only talk to their neighbors, making it harder to reach consensus. We identify a novel dissensus attack in which a few malicious nodes can exploit information bottlenecks in the topology to poison the collaboration. To address these issues, we propose a Self-Centered Clipping (SCClip) algorithm for Byzantine-robust consensus and optimization, which is the first to provably converge to an $O(\delta_{\max} \zeta^2 / \gamma^2)$ neighborhood of a stationary point for non-convex objectives under standard assumptions. Finally, we demonstrate the encouraging empirical performance of SCClip under a large number of attacks.
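A minimal sketch of the self-centered clipping idea as described in the abstract follows: each worker clips its neighbors' models toward its own iterate before gossip averaging, so a Byzantine neighbor's per-round influence is bounded. The clipping radius `tau`, the mixing weights, and the function names are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def clip(v, tau):
    """Scale v down to norm tau if it is longer than tau."""
    norm = np.linalg.norm(v)
    return v if norm <= tau else v * (tau / norm)

def sc_clip_gossip(models, neighbors, tau):
    """One round of self-centered clipped gossip.

    models:    dict worker -> parameter vector
    neighbors: dict worker -> list of (neighbor, mixing weight)
    """
    new_models = {}
    for i, xi in models.items():
        agg = xi.copy()
        for j, w in neighbors[i]:
            # Clip the *difference* to xi, so each neighbor (honest or
            # Byzantine) moves worker i by at most w * tau per round.
            agg += w * clip(models[j] - xi, tau)
        new_models[i] = agg
    return new_models

# Example: 3 workers on a fully connected graph, weight 0.25 per neighbor.
models = {i: np.array([float(i), 0.0]) for i in range(3)}
neighbors = {0: [(1, 0.25), (2, 0.25)],
             1: [(0, 0.25), (2, 0.25)],
             2: [(0, 0.25), (1, 0.25)]}
models = sc_clip_gossip(models, neighbors, tau=1.0)
```

Centering the clipping on each worker's own iterate, rather than on a global reference, is what lets the rule run on arbitrary graphs where no worker sees all models.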
Background. In medicine and other applications, the copying and sharing of data are impractical for a range of well-considered reasons. With federated learning (FL) techniques, machine learning models can be trained on data spread across several locations without such copying and sharing. While good privacy guarantees can often be made, FL does not automatically incentivize participation, and the resulting model can suffer if data is non-identically distributed (non-IID) across locations. Model personalization is a way of addressing these concerns. Methods. In this study, we introduce Weight Erosion: an SGD-based gradient aggregation scheme for personalized collaborative ML. We evaluate this scheme on a binary classification task on the Titanic data set. Findings. We demonstrate that the novel Weight Erosion scheme can outperform two baseline FL aggregation schemes on a classification task, and is more resistant to overfitting and non-IID data.
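The abstract does not spell out the Weight Erosion update rule, so the following is only an illustrative gradient-aggregation skeleton in the spirit of such a personalized scheme: each client's contribution weight shrinks ("erodes") when its gradient disagrees with the personalizing client's own gradient. The erosion rule, hyperparameters, and all names here are assumptions, not the paper's exact method.

```python
import numpy as np

def erode_weights(weights, grads, own_grad, erosion_rate=0.1):
    """Shrink a client's aggregation weight when its gradient points
    away from the personalizing client's own gradient.

    NOTE: this decay rule is a hypothetical stand-in for the paper's scheme.
    """
    for i, g in enumerate(grads):
        cos = g @ own_grad / (np.linalg.norm(g) * np.linalg.norm(own_grad) + 1e-12)
        if cos < 0:                       # disagreement -> erode this client
            weights[i] *= (1 - erosion_rate)
    return weights / weights.sum()        # renormalize to a convex combination

def personalized_step(x, grads, weights, own_idx, lr=0.05):
    """One SGD step of the personalized model for client own_idx."""
    weights = erode_weights(weights, grads, grads[own_idx])
    return x - lr * sum(w * g for w, g in zip(weights, grads)), weights

# Toy demo with 3 clients' gradients in R^2; client 0 personalizes.
grads = [np.array([1.0, 0.0]), np.array([0.9, 0.1]), np.array([-1.0, 0.0])]
w = np.ones(3) / 3
x = np.zeros(2)
x, w = personalized_step(x, grads, w, own_idx=0)
```

The design intuition matches the abstract: clients whose data distribution diverges from the personalizing client's contribute progressively less, which is what makes the scheme more robust to non-IID data than uniform averaging.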