Parkinson’s Disease and Adductor-type Spasmodic Dysphonia are two neurological disorders that greatly decrease the quality of life of millions of patients worldwide. Despite this great diffusion, the related diagnoses are often performed empirically, while it could be relevant to count on objective measurable biomarkers, among which researchers have been considering features related to voice impairment that can be useful indicators but that can sometimes lead to confusion. Therefore, here, our purpose was aimed at developing a robust Machine Learning approach for multi-class classification based on 6373 voice features extracted from a convenient voice dataset made of the sustained vowel/e/ and an ad hoc selected Italian sentence, performed by 111 healthy subjects, 51 Parkinson’s disease patients, and 60 dysphonic patients. Correlation, Information Gain, Gain Ratio, and Genetic Algorithm-based methodologies were compared for feature selection, to build subsets analyzed by means of Naïve Bayes, Random Forest, and Multi-Layer Perceptron classifiers, trained with a 10-fold cross-validation. As a result, spectral, cepstral, prosodic, and voicing-related features were assessed as the most relevant, the Genetic Algorithm performed as the most effective feature selector, while the adopted classifiers performed similarly. In particular, a Genetic Algorithm + Naïve Bayes approach brought one of the highest accuracies in multi-class voice analysis, being 95.70% for a sustained vowel and 99.46% for a sentence.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.