Density peaks clustering is a novel and efficient density-based clustering algorithm. However, the leakage of sensitive information and the associated security risks in applications of clustering methods are rarely considered. To address this problem, we propose a differential privacy-preserving density peaks clustering algorithm based on the shared near neighbors similarity. First, the Euclidean distance and the shared near neighbors similarity are combined to define the local density of a sample, and Laplace noise is added to the local density and the shortest distance to protect privacy. Second, the process of cluster center selection is optimized to select the initial cluster centers based on neighborhood information. Finally, each sample is assigned to the same cluster as its nearest neighbor with higher local density. Experimental results on both UCI and synthetic datasets show that, compared with other algorithms, our method protects data privacy more effectively and improves the quality of the clustering results.
INDEX TERMS Privacy preservation, differential privacy, density peaks clustering algorithm, shared near neighbors similarity.
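The steps in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it uses a plain Gaussian-kernel density in place of the combined Euclidean and shared near neighbors density, picks centers by the largest product of density and distance instead of the optimized neighborhood-based selection, and the function name `dp_density_peaks`, the cutoff `d_c`, and the sensitivity values are assumptions made only for illustration.

```python
import numpy as np

def dp_density_peaks(X, n_clusters, d_c, epsilon, rng,
                     rho_sensitivity=1.0, delta_sensitivity=1.0):
    """Toy density peaks clustering with Laplace-perturbed rho and delta.

    Assumptions (not from the source): Gaussian-kernel density, centers
    chosen by largest rho * delta, and placeholder sensitivities of 1.0.
    """
    n = len(X)
    # Pairwise Euclidean distances.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Local density: Gaussian kernel sum, excluding the point itself.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    # Laplace mechanism: noise scale = sensitivity / epsilon.
    rho_noisy = rho + rng.laplace(0.0, rho_sensitivity / epsilon, size=n)

    # Visit points from highest to lowest noisy density.
    order = np.argsort(-rho_noisy)
    delta = np.empty(n)
    parent = np.full(n, -1)
    # The densest point has no denser neighbor; use its largest distance.
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]  # points with higher noisy density
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]  # shortest distance to a denser point
        parent[i] = j
    delta_noisy = delta + rng.laplace(0.0, delta_sensitivity / epsilon, size=n)

    # Cluster centers: largest rho * delta (simplified selection rule).
    centers = np.argsort(-(rho_noisy * delta_noisy))[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)

    # Each remaining point joins the cluster of its nearest denser neighbor.
    for i in order:
        if labels[i] != -1:
            continue
        if parent[i] == -1:
            # Densest point was not picked as a center: use nearest center.
            labels[i] = labels[centers[np.argmin(dist[i, centers])]]
        else:
            labels[i] = labels[parent[i]]
    return labels, centers
```

A call such as `dp_density_peaks(X, 2, d_c=1.0, epsilon=1.0, rng=np.random.default_rng(0))` returns one label per sample and the indices of the chosen centers. Smaller values of `epsilon` add more noise, which strengthens the privacy guarantee at the cost of clustering quality. A real deployment would need sensitivities derived for the actual density and distance definitions.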
Cloud computing gives clients the convenience of outsourcing data calculations. However, it also brings the risk of privacy leakage, and processing industrial IoT datasets imposes a high computational cost on clients. To address these problems, this paper proposes a secure grid-based density peaks clustering algorithm for a hybrid cloud environment. First, the client uses a homomorphic encryption algorithm to construct encrypted objects from the client dataset. Second, the client uploads the encrypted data to the cloud servers, which carry out our security protocol. Finally, the cloud servers return the clustering results, with a disturbance applied, to the client. Experimental results on the UCI datasets and a smart power grid dataset show that the secure algorithm presented in this paper improves on the precision and efficiency of other clustering algorithms while also preserving user privacy. Moreover, the client only performs encryption and disturbance removal, so its computational complexity is low. Therefore, the secure clustering scheme proposed in this paper is applicable to industrial IoT big data and offers high security and scalability.
Issues related to privacy protection in cloud computing, in particular preventing the disclosure of sensitive information, have attracted the interest of researchers. To protect the privacy of users during clustering in a cloud computing environment, we present a privacy-preserving density peak clustering (PPDPC) algorithm that neither discloses personal privacy information nor leaks the cluster centers. Our scheme contains two steps of density peak clustering. First, a cloud service provider calculates the cluster centers without knowing each participant's private data and without disclosing any cluster center information to the other participants. Second, participants are allocated to clusters securely, so that no participant can identify the other members of the same cluster. Security analysis and comparison experiments show that the proposed PPDPC algorithm not only obtains good accuracy with respect to density peak clustering but also resists collusion attacks even if the cloud service provider is collaborating with all except one participant. Both theoretical analysis and experimental results confirm the security and accuracy of our method.
KEYWORDS cloud computing, data mining, density peak clustering, homomorphic encryption, privacy preservation
INTRODUCTION
With the rapid development of mobile social networks and computer technology, all types of mobile terminals and servers have started generating huge amounts of data at all times, presenting a serious challenge to the computing ability of enterprises [1][2]. Cloud computing technology, which is used to address this challenge, is growing. More and more enterprises are storing data in cloud servers to save costs, and the computing power of the cloud makes it convenient to handle huge amounts of data [3][4][5]. Additionally, data mining technology can help users analyze and extract valuable information from a large amount of data in scientific research and business applications.
The analysis of these data allows the prediction of future development trends and directions [6]. Clustering, as one of the important research methods of data mining, aims to divide data objects into several clusters such that the similarity between objects in the same cluster is high, while the similarity between clusters is low. In the process of clustering analysis, a large amount of user-based privacy data, such as geographical location, electricity consumption data, and spatiotemporal sensing data, is collected and analyzed [7][8][9]. The security and privacy of this data depend on the security of cloud services. The sensitive data is directly outsourced to the cloud server for calculation, at which point the user's privacy may be leaked if the cloud service provider is malicious or dishonest. If multiple users collude with each other, they can combine their own information to calculate their respective distances and then calculate the cluster centers from those distances. If users' privacy and the cluster centers are disclosed, serious consequences can result. Therefore, the development of a data mining techn...