Background: Medical researchers are developing non-invasive methods for the early detection of neurodegenerative diseases (NDDs), at a stage when pharmacological interventions can still slow disease progression. NDDs are associated with degradation of complex gait dynamics and motor activity. Classifying gait data with machine learning techniques can help physicians diagnose these neural disorders early, before clinical manifestations of the disease become apparent.
Aims: The present study was undertaken to classify control and NDD subjects using decision-tree-based classifiers: Random Forest (RF), J48, and REPTree.
Methodology: The data used in the study comprise 16 control, 20 Huntington’s Disease (HD), 15 Parkinson’s Disease (PD), and 13 Amyotrophic Lateral Sclerosis (ALS) subjects, taken from a publicly available PhysioNet database. The age ranges were 20-74 years for control subjects, 36-70 years for HD subjects, 44-80 years for PD subjects, and 29-71 years for ALS subjects. The data had 13 attributes. Important features were selected using the correlation-based feature selection (CFS) subset evaluation method. Three tree-based machine learning algorithms (RF, J48, and REPTree) were used to classify the control and NDD subjects. Classifier performance was evaluated using Precision, Recall, F-Measure, mean absolute error (MAE), and root mean squared error (RMSE).
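The five evaluation measures named above can be illustrated with a minimal sketch. It assumes a binary task (control = 0, NDD = 1), a 0.5 decision threshold, and that MAE and RMSE are computed between the true class indicator and the predicted probability of the positive class; these choices, the function names, and the toy labels and probabilities are illustrative assumptions and are not taken from the study.

```python
import math


def precision_recall_f(y_true, y_pred, positive=1):
    """Precision, Recall, and F-Measure for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def mae_rmse(y_true, prob_positive, positive=1):
    """MAE and RMSE between the 0/1 class indicator and the predicted
    probability of the positive class (an assumed convention)."""
    errors = [(1.0 if t == positive else 0.0) - p
              for t, p in zip(y_true, prob_positive)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


# Hypothetical toy data: 1 = NDD subject, 0 = control subject.
y_true = [1, 1, 1, 0, 0, 0]
prob_positive = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1]

# Assumed decision rule: predict NDD when the probability is at least 0.5.
y_pred = [1 if p >= 0.5 else 0 for p in prob_positive]

precision, recall, f_measure = precision_recall_f(y_true, y_pred)
mae, rmse = mae_rmse(y_true, prob_positive)

print(f"Precision={precision:.3f} Recall={recall:.3f} F-Measure={f_measure:.3f}")
print(f"MAE={mae:.3f} RMSE={rmse:.3f}")
```

Precision, Recall, and F-Measure depend only on the thresholded predictions, whereas MAE and RMSE under this convention also reflect how confident the classifier was, so the two groups of measures can rank classifiers differently.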
Results: To evaluate the performance of the tree-based classifiers, two feature settings were used: the complete feature set and the selected features. In classifying control vs HD subjects, RF provided the most robust separation, with a classification accuracy of 84.79% using complete features and 83.94% using selected features. In classifying control vs PD subjects and control vs ALS subjects, RF also provided the best separation, with classification accuracies of 86.51% and 94.95%, respectively, using complete features and 85.19% and 93.64%, respectively, using selected features.
Conclusion: The variability analysis of physiological signals provides a valuable non-invasive tool for quantifying the dynamics of physiological systems in healthy subjects and for examining alterations in the control mechanisms of these systems with aging and disease. It is concluded that the selected features encode adequate information about the neural control of gait. Moreover, the selected features, combined with tree-based machine learning algorithms, can play a vital role in the early detection of NDDs, when pharmacological interventions are still possible.