Background
In healthcare area, big data, if integrated with machine learning, enables health practitioners to predict the result of a disorder or disease more accurately. In Autistic Spectrum Disorder (ASD), it is important to screen the patients to enable them to undergo proper treatments as early as possible. However, difficulties may arise in predicting ASD occurrences accurately, mainly caused by human errors. Data mining, if embedded into health screening practice, can help to overcome the difficulties. This study attempts to evaluate the performance of six best classifiers, taken from existing works, at analysing ASD screening training dataset.
Result
We tested Naive Bayes, Logistic Regression, KNN, J48, Random Forest, SVM, and Deep Neural Network algorithms to ASD screening dataset and compared the classifiers’ based on significant parameters; sensitivity, specificity, accuracy, receiver operating characteristic, area under the curve, and runtime, in predicting ASD occurrences. We also found that most of previous studies focused on classifying health-related dataset while ignoring the missing values which may contribute to significant impacts to the classification result which in turn may impact the life of the patients. Thus, we addressed the missing values by implementing imputation method where they are replaced with the mean of the available records found in the dataset.
Conclusion
We found that J48 produced promising results as compared to other classifiers when tested in both circumstances, with and without missing values. Our findings also suggested that SVM does not necessarily perform well for small and simple datasets. The outcome is hoped to assist health practitioners in making accurate diagnosis of ASD occurrences in patients.