2020
DOI: 10.48550/arxiv.2010.13723
Preprint

Optimal Client Sampling for Federated Learning

Abstract: It is well understood that client-master communication can be a primary bottleneck in Federated Learning. In this work, we address this issue with a novel client subsampling scheme, where we restrict the number of clients allowed to communicate their updates back to the master node. In each communication round, all participating clients compute their updates, but only the ones with "important" updates communicate back to the master. We show that importance can be measured using only the norm of the update and w…
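
The selection rule sketched in the abstract can be illustrated with a small simulation. The code below is not the paper's optimal sampling scheme; it uses a simplified top-m rule in which only the m clients with the largest update norms communicate back, and the helper names (client_update, norm_based_round) and the toy least-squares objective are hypothetical choices made purely for illustration.

```python
import numpy as np

def client_update(weights, data, lr=0.1):
    """One local SGD step on a toy least-squares objective; stands in for any local solver."""
    X, y = data
    grad = X.T @ (X @ weights - y) / len(y)
    return -lr * grad  # the update this client would send to the master

def norm_based_round(weights, client_data, m):
    """Every client computes its update, but only the m clients with the
    largest update norms communicate back; the master averages those."""
    updates = [client_update(weights, d) for d in client_data]
    norms = np.array([np.linalg.norm(u) for u in updates])
    chosen = np.argsort(norms)[-m:]  # "important" = large update norm
    aggregate = np.mean([updates[i] for i in chosen], axis=0)
    return weights + aggregate, chosen

# Toy run: 10 clients, only 3 allowed to communicate per round.
rng = np.random.default_rng(0)
dim, n = 5, 20
clients = [(rng.normal(size=(n, dim)), rng.normal(size=n)) for _ in range(10)]
w = np.zeros(dim)
for _ in range(50):
    w, sent = norm_based_round(w, clients, m=3)
```

The key point the sketch conveys is that the selection decision uses only the norm of each update, so clients can decide locally whether their update is worth transmitting.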

Cited by 24 publications (40 citation statements). References 19 publications.
“…7. However, Ranklist-Multi-UCB seems 2) Large number of clients (10) per round: For ten clients per round, Fig. 8 shows that the overall performance of all three policies is similar to that of FedAvg with five clients per round.…”
Section: Experiments Results (mentioning)
confidence: 92%
“…Partial participation is a necessity in the cross-device regime where the training is performed over a very large number of clients (i.e., M is very large) most of which will only participate in the entire training procedure at most once. Sampling of clients to form a cohort can be done adaptively so as to choose the most informative clients (Chen et al, 2020).…”
Section: Ingredients of Successful Federated Learning Methods (mentioning)
confidence: 99%
“…Specifically, clients with "important" data would have higher probabilities to be sampled in each round. For example, existing works use clients' local gradient information (e.g., [25]- [27]) or local losses (e.g., [28]) to measure the importance of clients' data. However, these schemes did not consider the speed of error convergence with respect to wall-clock time, especially the straggling effect due to heterogeneous transmission delays.…”
Section: Related Work (mentioning)
confidence: 99%
“…One effective way of speeding up the convergence with respect to the number of training rounds is to choose clients according to some sampling distribution where "important" clients have high probabilities [21]- [24]. For example, recent works adopted importance sampling approaches based on clients' statistical property [25]- [28]. However, their sampling schemes did not account for the heterogeneous physical time in each round, especially under straggling circumstances.…”
Section: Introduction (mentioning)
confidence: 99%