Among chronic nervous system diseases, Parkinson's disease (PD) is known for progressively impairing speech, gait, and complex muscle and nerve actions. An early diagnosis of PD can therefore help reduce the symptoms. Telemedicine offers a cost-effective and convenient approach, and several studies have used dysphonic features to detect PD remotely. In this study, we used a data set from Kaggle containing voice measurements from 31 people, 23 of whom were diagnosed with PD. The data set comprised 195 voice recordings in total, each described by 22 voice-measurement attributes, including the pitch period entropy. During pre-processing, correlated attributes were removed, leaving 10 attributes with pairwise correlation below 0.7, along with the individual's status (0 for healthy, 1 for PD). The pre-processed data set was split in a 70:30 ratio into training and testing sets, and we ensured that the proportion of healthy to PD samples was the same in both sets. Four supervised machine learning (ML) classification models were evaluated: random forest, XGBoost, support vector machine (SVM) and decision tree. The XGBoost classifier was found to classify PD highly effectively, with an accuracy of 0.93.
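The two pre-processing steps described above (dropping attributes correlated at 0.7 or more, then a class-balanced 70:30 split) can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the study. The function names `pearson`, `drop_correlated` and `stratified_split` are hypothetical, the greedy "keep the first attribute of a correlated pair" rule is an assumption (the text does not say which attribute of a pair was discarded), and the toy values are made up.

```python
import math
import random
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.7):
    """Return the attribute names kept after correlation filtering.

    columns maps attribute name -> list of values. An attribute is kept
    only if |r| < threshold against every attribute already kept
    (assumed greedy rule, processed in insertion order).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split row indices so each class keeps the same train/test proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy data (made up): "b" is almost a multiple of "a", "c" is only weakly related.
toy_columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [5, 1, 4, 2, 3],
}
kept_names = drop_correlated(toy_columns, threshold=0.7)

# Toy labels (made up): 70 PD (1) and 30 healthy (0) rows.
toy_labels = [1] * 70 + [0] * 30
train_idx, test_idx = stratified_split(toy_labels, test_fraction=0.3, seed=0)
```

With real data, the resulting index lists would select the rows passed to each of the four classifiers. Library equivalents exist for both steps, but the version above avoids any dependency so the logic stays visible.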