The publication of patient datasets is essential for various medical investigations and decision-making. Considerable attention is currently devoted to protecting privacy during data publishing. However, existing privacy models for multiple sensitive attributes do not account for the correlation among the attributes, which leads to substantial utility loss. An efficient model, Heap Bucketization-anonymity (HBA), has been proposed to balance privacy and utility in datasets with multiple sensitive attributes. The HBA model uses anatomization to vertically partition the dataset into (i) a quasi-identifier table and (ii) a sensitive-attribute table. The quasi-identifiers are anonymized by applying k-anonymity and slicing, and the sensitive attributes are anonymized by applying slicing and Heap Bucketization. The metrics Normalized Certainty Penalty and KL-divergence are used to measure the utility loss in the patient dataset. The experimental results show that HBA achieves significantly higher privacy with less utility loss than other existing models. The HBA model not only balances utility and privacy but also thwarts (i) the background knowledge attack, (ii) the quasi-identifier attack, (iii) the membership attack, (iv) the non-membership attack, and (v) the fingerprint correlation attack.
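To make the anatomization step and the KL-divergence utility metric concrete, the following is a minimal sketch using pandas. The column names (Age, Zip, Disease, GID) and the data are hypothetical, and the sketch illustrates only the general ideas of vertical partitioning and distribution comparison, not the authors' HBA implementation.

```python
# Minimal sketch of anatomization and a KL-divergence utility check.
# Column names and data are hypothetical; this is not the HBA algorithm itself.
import numpy as np
import pandas as pd

def anatomize(df, quasi_identifiers, sensitive_attributes, group_col="GID"):
    """Vertically partition a table into a quasi-identifier table (QIT) and a
    sensitive-attribute table (SAT), linked only by a group id column."""
    qit = df[quasi_identifiers].copy()
    sat = df[sensitive_attributes].copy()
    qit[group_col] = df[group_col]
    sat[group_col] = df[group_col]
    return qit, sat

def kl_divergence(original, anonymized, eps=1e-12):
    """KL-divergence between the value distributions of a sensitive attribute
    before and after anonymization (lower means less utility loss)."""
    p = original.value_counts(normalize=True)
    q = anonymized.value_counts(normalize=True).reindex(p.index, fill_value=eps)
    return float(np.sum(p * np.log(p / q)))

# Hypothetical patient records already assigned to groups (GID).
records = pd.DataFrame({
    "GID":     [1, 1, 2, 2],
    "Age":     [34, 36, 52, 49],
    "Zip":     ["10001", "10002", "20001", "20004"],
    "Disease": ["Flu", "Asthma", "Cancer", "Flu"],
})
qit, sat = anatomize(records, ["Age", "Zip"], ["Disease"])
print(kl_divergence(records["Disease"], sat["Disease"]))  # 0.0: no distortion yet
```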
Individual privacy plays a vital role when a dataset is publicly disclosed. Privacy-preserving data publishing is the process of releasing an anonymized dataset for various analysis and research purposes. The data to be published contain several sensitive attributes such as diseases, salary, and symptoms. Earlier, researchers dealt with datasets under the assumption that each individual contributes only one record (a 1:1 dataset), which is restrictive in many applications. Later, many researchers turned to datasets in which an individual has multiple records (1:M datasets). In this paper, a model called f-slip is proposed to address various attacks on 1:M datasets, namely the Background Knowledge (bk) attack, the Multiple Sensitive attribute correlation attack (MSAcorr), the Quasi-identifier correlation attack (QIcorr), the Non-membership correlation attack (NMcorr), and the Membership correlation attack (Mcorr), together with solutions for these attacks. In f-slip, anatomization is performed to divide the table into two subtables containing (i) the quasi-identifiers and (ii) the sensitive attributes. The correlation of the sensitive attributes is computed so that they can be anonymized without breaking the linking relationship. Further, the quasi-identifier table is partitioned and k-anonymity is applied to it. An efficient anonymization technique, frequency slicing (f-slicing), is also developed to anonymize the sensitive attributes. The f-slip model remains consistent as the number of records increases. Extensive experiments on the real-world Informs dataset show that f-slip outperforms state-of-the-art techniques in terms of utility loss and efficiency, and achieves an optimal balance between privacy and utility.
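As a simple illustration of the k-anonymity requirement applied to the quasi-identifier table, the sketch below checks whether every quasi-identifier combination occurs at least k times. The generalized values are hypothetical, and the f-slip partitioning and frequency-slicing steps themselves are not reproduced here.

```python
# Minimal k-anonymity check on a generalized quasi-identifier table.
# Values are hypothetical; this is not the f-slip algorithm itself.
import pandas as pd

def is_k_anonymous(qi_table, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records."""
    group_sizes = qi_table.groupby(quasi_identifiers).size()
    return bool((group_sizes >= k).all())

# Hypothetical generalized quasi-identifier table (1:M, so one patient
# may contribute several rows).
qit = pd.DataFrame({
    "Age": ["30-40", "30-40", "30-40", "40-50", "40-50"],
    "Zip": ["100**", "100**", "100**", "200**", "200**"],
})
print(is_k_anonymous(qit, ["Age", "Zip"], k=2))  # True
```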
Big data involves massive volumes of data with varied characteristics and intricate structures. The large-scale data collection inherent in big data has given rise to numerous security and privacy threats. This study covers the evolution of big data and the need for security and privacy in big data. A big data taxonomy framework and the relevant privacy laws and acts are also analyzed. Various privacy-preserving data publishing models and their attack models are thoroughly studied under the categories of (1) record linkage models, (2) attribute linkage models, (3) table linkage models, and (4) probabilistic models. Furthermore, the trade-off between privacy and utility, future directions, and inferences from the study are summarized. The study provides insights into privacy-preserving data publishing techniques for addressing privacy problems in big data.