With the rise in technology, a huge volume of data is being processed using data mining, especially in the healthcare sector. Usually, medical data consist of a lot of personal data, and third parties utilize it for the data mining process. Perturbation in health care data highly aids in preventing intruders from utilizing the patient's privacy. One of the challenges in data perturbation is managing data utility and privacy protection. Medical data mining has certain special properties compared with other data mining fields. Hence, in this work, the machine learning (ML) based perturbation approach is introduced to provide more privacy to healthcare data. Here, clustering and IGDP-3DR processes are applied to improve healthcare privacy preservation. Initially, the dataset is pre-processed using data normalization. Then, the dimensionality is reduced by SVD with PCA (singular value decomposition with Principal component analysis). Then, the clustering process is performed by IFCM (Improved Fuzzy C means). The high-dimensional data are divided into several segments by IFCM, and every partition is set as a cluster. Then, improved Geometric Data Perturbation (IGDP) is used to perturb the clustered data. IGDP is a combination of GDP with 3D rotation (3DR). Finally, the perturbed data are classified using a machine learning (ML) classifier -kernel Support Vector Machine-Horse Herd Optimization (KSVM-HHO) to classify the perturbed data and ensure better accuracy. The overall evaluation of the proposed KSVM-HHO is carried out in the Python platform. The performance of the IGDP-KSVM-HHO is compared over the two benchmark datasets, Wisconsin prognostic breast cancer (WBC) and Pima Indians Diabetes (PID) dataset. For the WBC dataset, the proposed method obtains an overall accuracy of 98.08% perturbed data, and for the PID dataset, the proposed method obtains an overall accuracy of 98.04%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.