2020
DOI: 10.1007/978-3-030-47426-3_27

Decentralized and Adaptive K-Means Clustering for Non-IID Data Using HyperLogLog Counters

Abstract: The data shared over the Internet tends to originate from ubiquitous and autonomous sources such as mobile phones, fitness trackers, and IoT devices. Centralized and federated machine learning solutions represent the predominant way of providing smart services for users. However, moving data to a central location for analysis causes not only many privacy concerns, but also communication overhead. Therefore, in certain situations machine learning models need to be trained in a collaborative and decentralized manner…

Cited by 10 publications (5 citation statements)
References 16 publications
“…An alternative private distributed clustering approach is to select a subset of local points (representatives) and apply clustering over them. Soliman et al. [21] proposed running the K-Means algorithm locally on each client and using HyperLogLog counters to share the centroids, together with the approximate number of observations per centroid, with the other clients in a decentralized fashion. A weighted average over all the centroids is then taken to find the global centroids.…”
Section: Privacy-Preserving Distributed K-Means
Confidence: 99%
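The weighted-averaging step described in this citation statement can be sketched as follows. This is a minimal illustration, not the authors' implementation: `merge_centroids` is a hypothetical name, and the per-centroid counts are assumed to be the approximate cardinalities estimated from the shared HyperLogLog counters.

```python
import numpy as np

def merge_centroids(local_results):
    """Merge per-client centroids into global centroids by weighted averaging.

    local_results: list of (centroids, counts) pairs, one per client, where
    centroids has shape (k, d) and counts[i] is the (approximate, e.g.
    HyperLogLog-estimated) number of observations assigned to centroid i.
    """
    k, d = np.asarray(local_results[0][0]).shape
    weighted_sum = np.zeros((k, d))
    total = np.zeros(k)
    for centroids, counts in local_results:
        counts = np.asarray(counts, dtype=float)
        # Weight each client's centroid by its estimated cluster size.
        weighted_sum += np.asarray(centroids) * counts[:, None]
        total += counts
    return weighted_sum / total[:, None]
```

The global centroid thus lands closer to the clients that observed more points per cluster, which is what makes the scheme robust to unbalanced (non-IID) local datasets.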
“…• To aggregate the information learned from each of the clients into a global model using classical federated aggregation operators such as FedAvg, weighted FedAvg [24], and an aggregation operator adapting the k-means algorithm to a federated setting [25].…”
Section: Software Functionalities
Confidence: 99%
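Weighted FedAvg, as referenced above, averages client model parameters weighted by local dataset size. A minimal sketch, assuming each client reports a flat parameter array and its number of local samples (the function name is illustrative):

```python
import numpy as np

def weighted_fedavg(client_params, client_sizes):
    """Aggregate client parameter arrays into a global model,
    weighting each client by its local dataset size."""
    w = np.asarray(client_sizes, dtype=float)
    w /= w.sum()  # normalize weights so they sum to 1
    # Contract the client axis: sum_i w[i] * client_params[i]
    return np.tensordot(w, np.stack(client_params), axes=1)
```

Plain (unweighted) FedAvg is the special case where all `client_sizes` are equal.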
“…Although many works concentrate on distributed learning with clustering as the method for computing local models [17,18,19,20,21], few have considered the performance aspects of ML models in an inherently distributed network, especially when there are sub-groups of agents receiving data from different phenomena. At some timestamps T_i, some agents may have learned their provisional global models or received models from their neighbours.…”
Section: Introduction
Confidence: 99%