Background Access to neurological care for Parkinson disease (PD) is a rare privilege for millions of people worldwide, especially in resource-limited countries. In 2013, there were just 1200 neurologists in India for a population of 1.3 billion people; in Africa, the average population per neurologist exceeds 3.3 million people. In contrast, 60,000 people receive a diagnosis of PD every year in the United States alone, and similar patterns of rising PD cases—fueled mostly by environmental pollution and an aging population—can be seen worldwide. The current projection of more than 12 million patients with PD worldwide by 2040 is only part of the picture given that more than 20% of patients with PD remain undiagnosed. Timely diagnosis and frequent assessment are key to ensuring timely and appropriate medical intervention, thus improving the quality of life of patients with PD. Objective In this paper, we propose a web-based framework that can help anyone anywhere around the world record a short speech task and analyze the recorded data to screen for PD. Methods We collected data from 726 unique participants (PD: 262/726, 36.1%, of whom 38% were women; non-PD: 464/726, 63.9%, of whom 65% were women; average age 61 years) from all over the United States and beyond. A small portion of the data (approximately 54/726, 7.4%) was collected in a laboratory setting to compare the performance of the models trained with noisy home environment data against high-quality laboratory-environment data. The participants were instructed to utter a popular pangram containing all the letters in the English alphabet, “the quick brown fox jumps over the lazy dog.” We extracted both standard acoustic features (mel-frequency cepstral coefficients and jitter and shimmer variants) and deep learning–based embedding features from the speech data. Using these features, we trained several machine learning algorithms.
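The jitter and shimmer variants mentioned above quantify cycle-to-cycle instability in vocal fold vibration. As a minimal illustration (not the paper's actual extraction pipeline, which would typically rely on a standard toolkit such as Praat or openSMILE), local jitter is the mean absolute difference between consecutive pitch periods normalized by the mean period, and local shimmer is the analogous quantity computed over per-cycle peak amplitudes:

```python
import numpy as np

def jitter_shimmer(periods, amplitudes):
    """Local jitter and shimmer from per-cycle pitch periods (seconds)
    and per-cycle peak amplitudes. Both are dimensionless ratios."""
    periods = np.asarray(periods, dtype=float)
    amplitudes = np.asarray(amplitudes, dtype=float)
    # Mean absolute difference between consecutive cycles, normalized
    jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)
    shimmer = np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)
    return jitter, shimmer

# A perfectly periodic, constant-amplitude voice has zero jitter and shimmer
j, s = jitter_shimmer([0.01] * 10, [1.0] * 10)
```

Elevated jitter and shimmer indicate the breathy, unstable phonation characteristic of PD dysphonia, which is why these features carry diagnostic signal.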
We also applied model interpretation techniques such as Shapley additive explanations to ascertain the importance of each feature in determining the model’s output. Results We achieved an area under the curve of 0.753 for determining the presence of self-reported PD by modeling the standard acoustic features through XGBoost, a gradient-boosted decision tree model. Further analysis revealed that the widely used mel-frequency cepstral coefficient features and a subset of previously validated dysphonia features designed for detecting PD from a verbal phonation task (pronouncing “ahh”) influence the model’s decision the most. Conclusions Our model performed equally well on data collected in a controlled laboratory environment and in the wild across different gender and age groups. Using this tool, we can collect data from almost anyone anywhere with an audio-enabled device and help the participants screen for PD remotely, contributing to equity and access in neurological care.
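The modeling step described above can be sketched as follows. This is a hypothetical stand-in, not the authors' code: it uses scikit-learn's GradientBoostingClassifier in place of XGBoost, a synthetic feature matrix in place of the real acoustic features, and impurity-based feature importances as a rough proxy for Shapley additive explanations rankings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the acoustic feature matrix (726 participants,
# illustrative feature count); labels stand in for self-reported PD status
X, y = make_classification(n_samples=726, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=0)

# Gradient-boosted decision trees, evaluated by area under the ROC curve
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

# Rank features by importance; a SHAP analysis would additionally attribute
# each individual prediction to per-feature contributions
top = np.argsort(clf.feature_importances_)[::-1][:5]
```

In the actual study the same ranking idea, applied via Shapley values, is what surfaced the mel-frequency cepstral coefficient and dysphonia features as the most influential.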