Federated Learning with Incrementally Aggregated Gradients

Mitra, Arindam; Jaafar, Rayana H.; Pappas, George J.; Hassani, Hamed

doi:10.1109/cdc45484.2021.9683443

Cited by 13 publications

(29 citation statements)

References 9 publications


“…SGD with biased noise. Many algorithms can be viewed as SGD with structured but potentially biased noise, including SGD with (biased) compression (Stich et al, 2018; Gorbunov et al, 2020), delayed SGD (Mania et al, 2017; Dutta et al, 2018), local SGD (Stich, 2019), federated learning methods (Karimireddy et al, 2020; Yuan & Ma, 2020; Mitra et al, 2021; Nguyen et al, 2022), decentralized optimization methods (Yu et al, 2019; Koloskova et al, 2020), and many others. Convergence analyses for such methods often use techniques like perturbed iterate analysis (Mania et al, 2017).…”

Section: Related Work (mentioning)

confidence: 99%

Convergence of Gradient Descent with Linearly Correlated Noise and Applications to Differentially Private Learning

Koloskova, McKenna, Charles et al. 2023

Preprint


We study stochastic optimization with linearly correlated noise. Our study is motivated by recent methods for optimization with differential privacy (DP), such as DP-FTRL, which inject noise via matrix factorization mechanisms. We propose an optimization problem that distils key facets of these DP methods and that involves perturbing gradients by linearly correlated noise. We derive improved convergence rates for gradient descent in this framework for convex and nonconvex loss functions. Our theoretical analysis is novel and might be of independent interest. We use these convergence rates to develop new, effective matrix factorizations for differentially private optimization, and highlight the benefits of these factorizations theoretically and empirically.


“…As one of the earliest methods, FedAvg has been shown to effectively reduce the communication cost (McMahan et al, 2017). An increasing number of variants of FedAvg have been further proposed to address the issues such as the slow convergence and client drift via regularization (Li et al, 2020; Acar et al, 2021), variance reduction (Mitra et al, 2021; Karimireddy et al, 2020), proximal splitting (Pathak & Wainwright, 2020) and adaptive optimization (Reddi et al, 2020). In the homogeneous setting, FedAvg is relevant to local SGD, and has been analyzed in Stich 2019; Wang & Joshi 2018; Stich & Karimireddy 2019; Basu et al 2019.…”

Section: Related Work (mentioning)

confidence: 99%

“…In the homogeneous setting, FedAvg is relevant to local SGD, and has been analyzed in Stich 2019; Wang & Joshi 2018; Stich & Karimireddy 2019; Basu et al 2019. In the heterogeneous setting, Li et al 2020; Mitra et al 2021; Li et al 2019; Khaled et al 2019 provided the convergence analysis of their methods.…”

Section: Related Work (mentioning)

confidence: 99%


Communication-Efficient Federated Hypergradient Computation via Aggregated Iterative Differentiation

Xiao, Ji 2023

Preprint


Federated bilevel optimization has attracted increasing attention due to emerging machine learning and communication applications. The biggest challenge lies in computing the gradient of the upper-level objective function (i.e., hypergradient) in the federated setting due to the nonlinear and distributed construction of a series of global Hessian matrices. In this paper, we propose a novel communication-efficient federated hypergradient estimator via aggregated iterative differentiation (AggITD). AggITD is simple to implement and significantly reduces the communication cost by conducting the federated hypergradient estimation and the lower-level optimization simultaneously. We show that the proposed AggITD-based algorithm achieves the same sample complexity as existing approximate implicit differentiation (AID)-based approaches with far fewer communication rounds in the presence of data heterogeneity. Our results also shed light on the great advantage of ITD over AID in the federated/distributed hypergradient estimation. This differs from the comparison in the non-distributed bilevel optimization, where ITD is less efficient than AID. Our extensive experiments demonstrate the great effectiveness and communication efficiency of the proposed method.


“…Federated learning: At the core of federated learning is the prevailing FedAvg algorithm and its variants (McMahan et al, 2017; Karimireddy et al, 2019; Mitra et al, 2021; Acar et al, 2021; Stich, 2018; Yu et al, 2019; Qu et al, 2020) to address the communication efficiency and the data privacy concerns. We review literature with a focus on the analysis of the linear speedup for convergence.…”

Section: Related Work (mentioning)

confidence: 99%

LoAdaBoost: Loss-based AdaBoost federated machine learning with reduced computational complexity on IID and non-IID intensive care data

et al. 2020


Intensive care data are valuable for improvement of health care, policy making and many other purposes. Vast amounts of such data are stored in different locations, on many different devices and in different data silos. Sharing data among different sources is a big challenge due to regulatory, operational and security reasons. One potential solution is federated machine learning, which is a method that sends machine learning algorithms simultaneously to all data sources, trains models in each source and aggregates the learned models. This strategy allows utilization of valuable data without moving them. One challenge in applying federated machine learning is the possibly different distributions of data from diverse sources. To tackle this problem, we proposed an adaptive boosting method named LoAdaBoost that increases the efficiency of federated machine learning. Using intensive care unit data from hospitals, we investigated the performance of learning in IID and non-IID data distribution scenarios, and showed that the proposed LoAdaBoost method achieved higher predictive accuracy with lower computational complexity than the baseline method.
