The healthcare industry is changing at a tremendous rate as a result of healthcare innovations. Predictive analytics is increasingly used to diagnose patients' ailments and to extract actionable insights from existing healthcare data. This paper examines a decision support system that determines the health status of the foetus from cardiotocographic data using deep neural networks. The foetal health records are classified as normal, suspect or pathological. Because the multiclass cardiotocographic dataset shows a high degree of imbalance, a weighted deep neural network (DNN) is applied. To overcome the accuracy paradox caused by the multiclass imbalance, the performance of the classifier is measured with sensitivity, specificity, F1 score and G-mean rather than accuracy. The metrics are applied to the individual classes to ensure that the positive cases are identified correctly. The weighted DNN-based classifier classifies the positive instances with a G-mean score of 91%, which is better than the SVM classifier.
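The per-class metrics named in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy labels, and the inverse-frequency weighting rule are assumptions chosen for illustration, since the abstract does not say how the class weights of the weighted network were set.

```python
import math
from collections import Counter


def per_class_metrics(y_true, y_pred, labels):
    """One-vs-rest sensitivity, specificity, F1 and G-mean for each class."""
    results = {}
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        tn = sum(1 for t, p in pairs if t != c and p != c)
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        results[c] = {
            "sensitivity": sens,
            "specificity": spec,
            "f1": f1,
            "gmean": math.sqrt(sens * spec),  # geometric mean of the two rates
        }
    return results


def balanced_class_weights(y):
    """Assumed weighting rule: n_samples / (n_classes * count of the class)."""
    counts = Counter(y)
    n, k = len(y), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


# Toy data (hypothetical): a classifier that always predicts "normal".
labels = ["normal", "suspect", "pathological"]
y_true = ["normal"] * 8 + ["suspect", "pathological"]
y_pred = ["normal"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
metrics = per_class_metrics(y_true, y_pred, labels)
weights = balanced_class_weights(y_true)
```

On this toy data the accuracy is 0.8 even though no suspect or pathological case is detected, and the G-mean of every class is 0. This is the accuracy paradox the abstract refers to. The weights from `balanced_class_weights` give the rare classes a larger share of the loss, which is one common way to weight a network during training.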
Abstract: Predicting patients' health is a critical task in the healthcare industry. Healthcare datasets show a high degree of imbalance, especially for rare diseases. The current work aims to predict the post-operative survival rate in thoracic surgery datasets. The dataset is imbalanced, with around 15% positive cases and the remaining 85% negative cases. Commonly used machine learning techniques score poorly on the positive cases despite the high accuracy of their predictions for the negative cases. We use SMOTE (synthetic minority oversampling technique) to reduce the degree of imbalance and increase the proportion of positive samples before applying the following classifiers and examining the results: Naïve Bayes, neural networks, random forest, boosting algorithms (AdaBoost and Extreme Gradient Boosting) and support vector machines (SVM). The study shows that SVM and Naïve Bayes perform significantly better than the other models on the imbalanced dataset, and better with the SMOTE-augmented data than with the original data.
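The core idea of SMOTE can be sketched in a few lines. This is a simplified illustration rather than the study's pipeline: the function name `smote`, its parameters and the toy points are hypothetical. Each synthetic sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # k nearest minority neighbours of x by squared Euclidean distance
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy minority class (hypothetical feature vectors)
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6, k=2)
```

With a 15%/85% split, generating enough synthetic positives to match the negative count would balance the classes. Oversampling is normally applied to the training split only, so that the test data keep the original class distribution.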