In recent years, many researchers have concentrated on medical data analytics, because machine intelligence in medical diagnosis is an emerging trend across a wide range of medical applications. Medical datasets are generally massive, so traditional classifiers suffer from overfitting and underfitting on the training set. In this paper, a Gradient Descent Logistic Regression (GDLR) classification method is proposed for medical data classification. The Pearson Correlation Coefficient (PCC) is used to calculate the correlation between features. The Random Forest (RF) algorithm then ranks the features, assigning feature importance within the tree structure, and selects the most relevant features to improve classification performance. The selected features are passed to GDLR, which further analyses feature importance based on the weight values and identifies more relevant features than RF alone. The experimental analysis demonstrates that GDLR performs better than the traditional methods Neural Network for Threshold Selection (NNTS) and Mean Selection (MS). The proposed GDLR method achieves an accuracy of 97.5% on the Hepatitis dataset, whereas the existing mean selection method achieves 82.58%.
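As an illustration of two of the steps named in this abstract, the following minimal sketch computes a Pearson correlation and fits a logistic regression by batch gradient descent, then reads the learned weights as a rough indicator of feature relevance. It is not the authors' implementation: the function names (`pearson`, `fit_logistic_gd`, `predict`), the learning rate, the epoch count, and the toy data are all assumptions made for this example, and the Random Forest ranking step is omitted because it would need a tree-ensemble library.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / (sx * sy)

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic_gd(X, y, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(y=1 | x) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    """Label a sample 1 when the predicted probability is at least 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

# Invented toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.10, 0.5], [0.20, 0.1], [0.25, 0.9], [0.30, 0.3],
     [0.70, 0.3], [0.75, 0.9], [0.80, 0.1], [0.90, 0.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Correlation of each feature with the label (PCC step).
r_signal = pearson([row[0] for row in X], y)
r_noise = pearson([row[1] for row in X], y)

# Gradient descent fit; weight magnitudes hint at feature relevance.
w, b = fit_logistic_gd(X, y)
preds = predict(X, w, b)
print("correlations:", r_signal, r_noise)
print("weights:", w, "bias:", b)
print("predictions:", preds)
```

On features of comparable scale, a larger absolute weight suggests a more influential feature, which is the sense in which the weights are read here. Real medical data would need feature scaling and a held-out test set before such a reading is meaningful.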
The process of discovering interesting, previously unknown, but potentially useful patterns from large spatial datasets is called spatial data mining. Extracting such patterns from spatial datasets is more difficult than extracting the corresponding patterns from traditional numeric and categorical data because of the complexity of spatial data types, spatial associations, and the correlations between them. Spatial information may include not only geographical information or symbols but also personal information, such as a person's name or mobile phone number linked with their address. Spatial data mining operations on such databases may therefore not only extract the required patterns but also reveal sensitive personal information associated with the spatial database. An intruder can threaten individual privacy by extracting spatial data mining results or patterns and linking the data, in various situations, with previous knowledge and other information obtained from heterogeneous sources. This paper focuses on the various ways in which an individual's privacy can easily be violated in spatial data mining operations. Different techniques are proposed that can preserve an individual's privacy in such operations.