2019 53rd Asilomar Conference on Signals, Systems, and Computers
DOI: 10.1109/ieeeconf44664.2019.9049023

FedDANE: A Federated Newton-Type Method

Abstract: Federated learning aims to jointly learn statistical models over massively distributed remote devices. In this work, we propose FedDANE, an optimization method that we adapt from DANE [9, 10], a method for classical distributed optimization, to handle the practical constraints of federated learning. We provide convergence guarantees for this method when learning over both convex and non-convex functions. Despite encouraging theoretical results, we find that the method has underwhelming performance empirically.…
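
To make the method concrete, the following is a minimal sketch of a DANE-style round with partial device participation, in the spirit of the approach described in the abstract. It assumes simple least-squares local objectives F_k(w) = 0.5 * ||A_k w - b_k||^2, and the names local_gradient, solve_subproblem, feddane_round, mu, and frac are illustrative; this is a sketch of the general technique under those assumptions, not the paper's reference implementation.

import numpy as np

def local_gradient(A, b, w):
    # Gradient of the assumed local objective F_k(w) = 0.5 * ||A w - b||^2.
    return A.T @ (A @ w - b)

def solve_subproblem(A, b, w_t, g_t, mu):
    # Minimize the DANE-style local subproblem
    #   F_k(w) - <grad F_k(w_t) - g_t, w> + (mu / 2) * ||w - w_t||^2,
    # where g_t estimates the global gradient. For least squares the
    # minimizer has the closed form solved below.
    correction = local_gradient(A, b, w_t) - g_t
    H = A.T @ A + mu * np.eye(A.shape[1])
    return np.linalg.solve(H, A.T @ b + correction + mu * w_t)

def feddane_round(data, w_t, mu, rng, frac=0.1):
    # One round: sample devices to estimate the global gradient, then sample
    # devices to solve their regularized local subproblems, and average.
    K = len(data)                          # data[k] = (A_k, b_k) for device k
    m = max(1, int(frac * K))
    S1 = rng.choice(K, m, replace=False)   # gradient-collection stage
    g_t = np.mean([local_gradient(*data[k], w_t) for k in S1], axis=0)
    S2 = rng.choice(K, m, replace=False)   # local-update stage
    return np.mean([solve_subproblem(*data[k], w_t, g_t, mu) for k in S2], axis=0)

The regularization weight mu acts as a damping term: larger values keep the sampled devices' local solutions closer to the current global model w_t.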

Cited by 367 publications (624 citation statements). References 8 publications.

“…Users may not participate sufficiently in the FL process for several reasons, such as low battery power, poor connection, and so on. The low participation issue during FL model training has been highlighted in several studies 2,68 …”
Section: Discussion (mentioning)
confidence: 99%
“…Some of them originate from DANE [12], which is a classical optimization method that introduces a sequence of local subproblems to reduce client drift. Fed-DANE [13] is adapted from DANE by allowing partial device participation. Network-DANE [14] is developed for decentralized federated learning.…”
Section: B. Training Variance (mentioning)
confidence: 99%
“…There is another line of work that reduces inter-client variance to eliminate inconsistent updates across clients. Several approaches use control variates [10,11,18,20] or global gradient momentum [30] to reduce biases in client updates. [5,13] apply the STORM algorithm [4] to reduce variance caused by both server-level and client-level SGD procedures.…”
Section: Related Work (mentioning)
confidence: 99%
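
The control-variate correction mentioned in the excerpt above can be illustrated with a small sketch in the spirit of SCAFFOLD-style methods. The names corrected_local_step, grad_fn, c_local, c_global, and lr are illustrative, not any specific paper's API; this is a sketch of the general idea, not a full algorithm.

import numpy as np

def corrected_local_step(w, grad_fn, c_local, c_global, lr=0.1):
    # One local step; the term (c_global - c_local) counteracts client drift
    # by steering each client's update toward the average client direction.
    return w - lr * (grad_fn(w) - c_local + c_global)

# Toy usage: two clients with different local optima. With control variates set
# to the clients' gradients at the current server model (and c_global to their
# mean), both clients take the same averaged direction on their first step,
# which illustrates the inter-client variance reduction the excerpt refers to.
w_server = np.zeros(2)
grads = [lambda w: 2.0 * (w - np.array([1.0, 0.0])),
         lambda w: 2.0 * (w - np.array([-1.0, 2.0]))]
c_locals = [g(w_server) for g in grads]
c_global = np.mean(c_locals, axis=0)
updates = [corrected_local_step(w_server, g, c, c_global)
           for g, c in zip(grads, c_locals)]
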
“…This feature allows the proposed method to achieve the same level of task-specific performance with fewer communication rounds. Moreover, while most existing methods impose additional requirements compared to FedAvg, such as full participation [13,20,34], additional communication bandwidth [5,10,11,18,30,36], or storage costs on clients to store local states [1,11,16], FedAGM is completely free from any additional communication and memory overhead, which ensures compatibility with large-scale, low-participation federated learning scenarios. The main contributions of this paper are summarized as follows.…”
Section: Introduction (mentioning)
confidence: 99%