2023
DOI: 10.3390/s23031152
Performance Enhancement in Federated Learning by Reducing Class Imbalance of Non-IID Data

Abstract: Due to distributed data collection and learning in federated learning, many clients conduct local training with non-independent and identically distributed (non-IID) datasets. Training on these datasets results in severe performance degradation. We propose an efficient algorithm for enhancing the performance of federated learning by overcoming the negative effects of non-IID datasets. First, the intra-client class imbalance is reduced by rendering the class distribution of clients close…

Cited by 14 publications (9 citation statements) · References 23 publications
“…Researchers in [26] demonstrate that the FedAvg algorithm can achieve satisfactory accuracy even in scenarios where the data is non-IID among participants. Specifically, it is observed in [27] that a CNN model trained with FedAvg on the CIFAR-10 dataset achieves 51% lower accuracy than a centrally trained model. This decrease in accuracy is quantified using the Earth Mover's Distance (EMD), which measures the disparity between the data distribution of participants in FL and the overall population distribution.…”
Section: Related Work
confidence: 99%
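The EMD-based skew measure described in this statement can be sketched in a few lines: for discrete class distributions, the distance used in [27] reduces to an L1-style comparison of probability vectors. The function names below are illustrative, not from the cited papers.

```python
import numpy as np

def class_distribution(labels, num_classes):
    """Empirical class distribution of a client's label set."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()

def emd(p, q):
    """Earth Mover's Distance between two discrete class
    distributions; for categorical histograms this is the L1
    distance between the probability vectors, the quantity used
    to relate data skew to FedAvg accuracy loss."""
    return np.abs(np.asarray(p) - np.asarray(q)).sum()

# A client holding mostly class 0, compared with a uniform population:
client = class_distribution(np.array([0, 0, 0, 0, 1]), num_classes=2)
population = np.array([0.5, 0.5])
print(round(emd(client, population), 3))  # 0.6
```

A larger EMD between a client's distribution and the population distribution corresponds, per the observation above, to a larger accuracy gap versus centralized training.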
“…While this approach ensures theoretical convergence for defined objectives, it is limited to convex objectives and faces scalability challenges when dealing with extensive networks. Agnostic FL [27] provides a more structured alternative by optimizing the central model via minimax optimization so that it suits any target distribution formed as a mixture of the client distributions.…”
Section: Statistical Challenges
confidence: 99%
“…The first type refers to a client's class distribution differing from a uniform distribution in terms of how much data are distributed among classes; this is known as intra-client class imbalance [28]. When a disparity occurs between the class distributions of different clients, this is known as inter-client class imbalance. Some studies are presented here to solve the first type of quantity deviation.…”
Section: Quantity Skew (Imbalanced Data)
confidence: 99%
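The intra-client imbalance described here — a client's class distribution deviating from uniform — can be measured directly, and clients can then be filtered on that measure. This is a minimal sketch of the idea, not the paper's algorithm; the L1 distance and the threshold value are our assumptions.

```python
import numpy as np

def imbalance(labels, num_classes):
    """Distance of a client's class distribution from the uniform
    distribution; 0 means perfectly balanced."""
    dist = np.bincount(labels, minlength=num_classes) / len(labels)
    uniform = np.full(num_classes, 1.0 / num_classes)
    return np.abs(dist - uniform).sum()

def select_clients(client_labels, num_classes, threshold=0.5):
    """Keep only clients whose class distribution is close to
    uniform (the selection rule sketched in the quoted passage;
    the threshold here is an arbitrary illustrative value)."""
    return [i for i, y in enumerate(client_labels)
            if imbalance(y, num_classes) <= threshold]

clients = [np.array([0, 1, 0, 1]),          # balanced -> imbalance 0
           np.array([0, 0, 0, 0, 0, 1]),    # skewed   -> imbalance 2/3
           np.array([0, 0, 1, 1, 0, 1])]    # balanced -> imbalance 0
print(select_clients(clients, num_classes=2))  # [0, 2]
```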
“…A prior study introduced FLY-SMOTE, a new method that generates synthetic data for the minority class in supervised learning tasks using a modified SMOTE method, re-balancing the data in various non-IID contexts [37]. The last paper in this category proposed improving FL by addressing data heterogeneity: the size difference between classes is mitigated by making the class distribution of each client close to the uniform distribution [28]. To mitigate user imbalance, only clients whose class distribution is close to uniform are chosen to participate in training [28].…”
Section: Quantity Skew (Imbalanced Data)
confidence: 99%
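The core SMOTE idea behind the FLY-SMOTE approach cited above is to synthesize new minority-class samples by interpolating between a real sample and one of its nearest minority-class neighbours. The sketch below illustrates that idea only — it is not the modified algorithm from [37], and the function name and parameters are ours.

```python
import numpy as np

def smote_like(minority, n_new, k=2, rng=None):
    """SMOTE-style oversampling sketch: each synthetic point is a
    random interpolation between a minority sample and one of its
    k nearest minority neighbours."""
    rng = np.random.default_rng(rng)
    X = np.asarray(minority, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X))
        # Distances from X[i] to all minority points; drop index 0
        # of the sort order, which is X[i] itself (distance 0).
        d = np.linalg.norm(X - X[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]
        j = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(X[i] + gap * (X[j] - X[i]))
    return np.array(synthetic)

minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like(minority, n_new=4, rng=0)
print(new_points.shape)  # (4, 2)
```

Because each synthetic point lies on a segment between two real minority samples, the generated data stays inside the region the minority class already occupies, which is what makes this re-balancing comparatively safe on skewed non-IID client datasets.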