Irrelevant features in a heart disease dataset degrade the performance of a binary classification model. Consequently, eliminating irrelevant and redundant features from the training set with a feature selection algorithm can significantly improve the performance of a classification model for heart disease detection. Sequential feature selection (SFS) is an effective algorithm for improving classification performance on heart disease detection while reducing computational time. In this study, the SFS algorithm is implemented to improve classifier performance on heart disease detection by removing irrelevant features and training a model on the optimal feature subset. Furthermore, exhaustive and permutation-based feature selection algorithms are implemented and compared with the SFS algorithm. The implemented and existing feature selection algorithms are evaluated using the real-world Pima Indian heart disease dataset, and the results appear to show that the SFS algorithm outperforms the exhaustive and permutation-based feature selection algorithms. Overall, the results look promising, and a more effective heart disease detection model is developed with an accuracy of 99.3%.
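As an illustration of the idea behind sequential feature selection, the following is a minimal sketch of greedy forward selection. It is not the study's implementation: the scoring function (leave-one-out accuracy of a 1-nearest-neighbour classifier), the stopping rule (stop when no candidate strictly improves the score), and the toy data `TOY_X` and `TOY_Y` are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def loo_1nn_accuracy(X: Matrix, y: Sequence[int], features: List[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the columns listed in `features`."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = -1, float("inf")
        for j in range(len(X)):
            if i == j:
                continue  # hold out sample i
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        if y[best_j] == y[i]:
            correct += 1
    return correct / len(X)


def sequential_forward_selection(
    X: Matrix,
    y: Sequence[int],
    score: Callable[[Matrix, Sequence[int], List[int]], float] = loo_1nn_accuracy,
) -> Tuple[List[int], float]:
    """Greedily add the feature that most improves `score`;
    stop as soon as no remaining feature gives a strict improvement."""
    remaining = list(range(len(X[0])))
    selected: List[int] = []
    best_score = float("-inf")
    while remaining:
        scored = [(score(X, y, selected + [f]), f) for f in remaining]
        cand_score, cand = max(scored, key=lambda t: t[0])
        if cand_score <= best_score:
            break
        selected.append(cand)
        remaining.remove(cand)
        best_score = cand_score
    return selected, best_score


# Hypothetical toy data: column 0 separates the classes, columns 1 and 2 do not.
TOY_X = [
    [1.0, 5.0, 3.0], [1.2, 1.0, 9.0], [0.8, 9.0, 1.0], [1.1, 3.0, 7.0],
    [3.0, 4.0, 2.0], [3.2, 8.0, 8.0], [2.9, 2.0, 4.0], [3.1, 6.0, 6.0],
]
TOY_Y = [0, 0, 0, 0, 1, 1, 1, 1]

selected, best = sequential_forward_selection(TOY_X, TOY_Y)
print(selected, best)
```

On this toy data the search keeps only the informative column and discards the two uninformative ones. An exhaustive search would instead score every subset of features, which grows exponentially with the number of features; the greedy search scores at most a quadratic number of subsets.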
Heart disease is one of the causes of death throughout the world. Medical experts and practitioners cannot easily identify heart disease, because its detection requires expertise and experience. Hence, developing better-performing models for heart disease detection using machine learning algorithms is crucial for detecting heart disease at an early stage. However, employing a machine learning algorithm involves determining the relationships between the features of the heart failure dataset. In this study, correlation analysis is employed to identify the relationships among the heart failure dataset features, and a predictive model for heart failure detection is developed with K-nearest neighbor (KNN). Pearson correlation is used to identify the relationships between the features, and the effect of a feature's strong correlation with the target on the performance of the KNN model is analyzed. The experimental results show that highly correlated features significantly affect the performance of KNN for heart failure detection. Finally, the performance of KNN is evaluated, and the results reveal that the model has an acceptable level of performance, with a highest accuracy of 97.07% on heart failure prediction.
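The two building blocks named in this abstract, Pearson correlation between a feature and the target and a K-nearest-neighbor classifier, can be sketched as follows. This is only an illustrative sketch: the function names `pearson` and `knn_predict`, the choice of k = 3, the Euclidean distance, and the toy values are assumptions and are not taken from the study.

```python
import math
from collections import Counter
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient r between two equal-length sequences.
    Returns 0.0 when either sequence has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return cov / math.sqrt(sxx * syy)


def knn_predict(
    train_X: Sequence[Sequence[float]],
    train_y: Sequence[int],
    query: Sequence[float],
    k: int = 3,
) -> int:
    """Majority vote among the k training points closest to `query`
    (squared Euclidean distance)."""
    by_distance = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], query)),
    )
    votes = Counter(train_y[i] for i in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical single-feature toy data, not drawn from the heart failure dataset.
feature: List[float] = [1.0, 1.2, 0.9, 3.0, 3.1, 2.8]
target: List[int] = [0, 0, 0, 1, 1, 1]

r = pearson(feature, target)
pred_low = knn_predict([[v] for v in feature], target, [1.1])
pred_high = knn_predict([[v] for v in feature], target, [2.9])
print(round(r, 3), pred_low, pred_high)
```

A feature whose correlation with the target is close to +1 or -1, as in this toy example, makes the neighbours of a query point mostly share its class, which is one way to see why strongly correlated features can dominate KNN performance.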
The first step in the diagnosis of breast cancer is the identification of the disease. Early detection of breast cancer is important for reducing the mortality rate due to the disease. Machine learning algorithms can be used to identify breast cancer. Supervised machine learning algorithms such as the Support Vector Machine (SVM) and the Decision Tree are widely used in classification problems, such as the identification of breast cancer. In this study, a machine learning model is proposed by employing two learning algorithms, namely the support vector machine and the decision tree. A dataset from the Kaggle data repository, consisting of 569 malignant and benign observations, is used to develop the proposed model. Finally, the model is evaluated on the test set using accuracy, the confusion matrix, precision, and recall as performance metrics. The analysis shows that the support vector machine has better accuracy, a lower misclassification rate, and better precision than the decision tree algorithm. The average accuracy of the support vector machine is 91.92% and that of the decision tree classification model is 87.12%.
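The evaluation metrics listed in this abstract can all be derived from the four cells of a binary confusion matrix. The sketch below shows one way to compute them; the function name `binary_metrics`, the choice of label 1 (for example, malignant) as the positive class, and the example labels are assumptions made for illustration and are not the study's data or results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix counts and derived metrics, treating label 1 as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / total,
        # Precision: of the predicted positives, how many were truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of the true positives, how many were found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for demonstration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Comparing two classifiers, as the abstract does for the SVM and the decision tree, amounts to calling such a function once per model on the same test labels and comparing the resulting values.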