2021
DOI: 10.1007/978-3-030-74251-5_35

Efficient Privacy Preserving Distributed K-Means for Non-IID Data

Abstract: Privacy is becoming a crucial requirement in many machine learning systems. In this paper we introduce an efficient and secure distributed K-Means algorithm that is robust to non-IID data. The basic idea of our proposal is for each client to run the K-Means algorithm locally, with a variable number of clusters. The server then applies K-Means again to the resulting centroids to discover the global centroids. To maintain the clients' privacy, homomorphic encryption and secure aggregatio…
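The two-stage idea in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`kmeans`, `federated_kmeans`), the fixed per-client cluster counts in `local_ks`, and the toy data are all hypothetical, and the privacy layer (homomorphic encryption, secure aggregation) is left out entirely, so centroids travel in the clear here.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def federated_kmeans(client_datasets, local_ks, global_k):
    """Each client clusters locally; the server clusters the pooled centroids.

    Assumption: local_ks is chosen per client in advance. No encryption or
    secure aggregation is applied in this sketch.
    """
    pooled = []
    for data, k in zip(client_datasets, local_ks):
        pooled.extend(kmeans(data, k))  # only centroids leave the client
    return kmeans(pooled, global_k)


# Non-IID toy data: client A sees one cluster, client B sees two others.
client_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
client_b = [(10, 10), (10, 11), (20, 0), (20, 1)]
global_centroids = federated_kmeans([client_a, client_b], local_ks=[1, 2], global_k=3)
```

Letting each client use its own number of clusters is what makes the scheme tolerant of non-IID data in this sketch: a client that only holds one cluster contributes one centroid instead of being forced to split its data.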

citations
Cited by 4 publications
(3 citation statements)
references
References 27 publications
“…ii) Client Initialization First: Since the initial centroid selection starts closer to the data source, this approach has shown promise in enhancing overall performance. Brandao et al [13] suggested an approach where each client first locally determines the optimal K̃ within the range [1, K] using the Silhouette metric [14]. Subsequently, they share the best K̃ initial centroids with the server.…”
Section: Related Work (mentioning)
confidence: 99%
“…Dennis et al [17] introduced a one-shot federated clustering scheme based on K-means (known as K-FED). However, a practical drawback shared with Brandao's [13] method is the substantial computational burden placed on edge devices, which typically have constrained computing capabilities. Similar to Dennis et al's [17] approach, we start the initialization for federated K-means at the edge clients, where the data resides.…”
Section: Related Work (mentioning)
confidence: 99%
“…Negative database-based methods have two problems: conversion to negative database is not possible for all kinds of data and there is huge overhead on data owner side for negative database construction. Brando et al (2021) [18] proposed a distributed privacy preserving K-mean algorithm. Client compute K-mean for their data locally and send the centroids to a server.…”
Section: Related Work (mentioning)
confidence: 99%