For several decades there has been sporadic interest in automatically characterizing the speech impairment caused by Parkinson’s disease (PD). Most early studies were confined to quantifying a few speech features that were easy to compute. More recent studies have adopted a machine learning approach in which a large number of potential features are extracted and models are learned automatically from the data. In the same vein, here we characterize the disease using a relatively large cohort of 168 subjects recruited from three clinics. We elicited speech using three tasks – a sustained phonation task, a diadochokinetic task, and a reading task – all within a time budget of 4 minutes, prompted by a portable device. From these recordings, we extracted 1582 features per subject using openSMILE, a standard feature extraction tool. We compared three strategies for learning a regularized regression and found that ridge regression performs better than lasso and support vector regression for our task. We refine the feature extraction to capture pitch-related cues, including jitter and shimmer, more accurately using a time-varying harmonic model of speech. Our results show that the severity of the disease can be inferred from speech with a mean absolute error of about 5.5, explaining 61% of the variance, with performance consistently well above chance across all clinics. Of the three speech elicitation tasks, we find that the reading task captures cues significantly better than the diadochokinetic or sustained phonation tasks. In all, we have demonstrated that data collection and inference can be fully automated, and the results show that speech-based assessment has promising practical application in PD. The techniques reported here are more widely applicable to other paralinguistic tasks in the clinical domain.
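To make the regressor comparison concrete, here is a minimal sketch, assuming scikit-learn, of how ridge, lasso, and support vector regression might be compared on openSMILE-style feature vectors. The synthetic feature matrix, severity scores, and hyper-parameter values are illustrative assumptions, not the authors' configuration.

```python
# Sketch only: cross-validated comparison of regularized regressors on
# hypothetical 1582-dimensional openSMILE feature vectors (one per subject).
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge, Lasso
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error, r2_score

rng = np.random.default_rng(0)
X = rng.standard_normal((168, 1582))   # placeholder: 1582 openSMILE features per subject
y = rng.uniform(0, 50, size=168)       # placeholder: clinical severity scores

models = {
    "ridge": Ridge(alpha=10.0),        # illustrative regularization strengths
    "lasso": Lasso(alpha=0.1),
    "svr":   SVR(kernel="linear", C=1.0),
}

for name, model in models.items():
    pipe = make_pipeline(StandardScaler(), model)      # standardize, then regress
    pred = cross_val_predict(pipe, X, y, cv=5)         # 5-fold cross-validated predictions
    print(f"{name}: MAE={mean_absolute_error(y, pred):.2f}, "
          f"R^2={r2_score(y, pred):.2f}")
```

With real features and severity labels, the same loop would report the mean absolute error and explained variance quoted in the abstract for each regressor.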
Research Highlights
• We propose a new method to cluster multiple, possibly intersecting manifolds.
• We define a new notion of distance between points based on the shortest constrained path.
• We apply our method to simulated and real datasets and obtain good results.

ABSTRACT
The massive amount of high-dimensional data in science and engineering demands new approaches to data analysis. Subspace techniques have shown remarkable success in numerous problems in computer vision and data mining, where the goal is to recover the low-dimensional structure of data in an ambient space. Traditional subspace methods such as PCA and ICA assume that the data come from a single manifold. In practice, however, the data may come from several (possibly intersecting) manifolds (surfaces). This has motivated the development of new nonlinear techniques for clustering subspaces of high-dimensional data. In this paper, we propose a new algorithm for subspace clustering of data that lie on several possibly intersecting manifolds. To this end, we first propose a curvature constraint for finding the shortest path between data points and then use it within Isomap for subspace learning. The algorithm chooses several landmark nodes at random and then checks whether a curvature-constrained path exists between each landmark node and every other node in the neighborhood graph. It builds a binary feature vector for each point in which each entry represents the connectivity of that point to a particular landmark. These binary feature vectors can then be used as input to conventional clustering algorithms such as hierarchical clustering. Experiments on both synthetic and real data sets confirm the performance of our algorithm.
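As a rough illustration of the landmark-based feature construction described above, the sketch below builds binary connectivity vectors over a k-nearest-neighbor graph and clusters them hierarchically. Plain graph reachability stands in for the paper's curvature-constrained path test, which is not reproduced here, and all sizes and parameters are assumptions.

```python
# Sketch only: landmark-connectivity features + hierarchical clustering.
# Reachability in the k-NN graph replaces the curvature-constrained path check.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import connected_components
from sklearn.cluster import AgglomerativeClustering

def landmark_features(X, n_landmarks=10, n_neighbors=8, seed=0):
    rng = np.random.default_rng(seed)
    graph = kneighbors_graph(X, n_neighbors, include_self=False)
    graph = graph.maximum(graph.T)                       # symmetrize the k-NN graph
    _, comp = connected_components(graph, directed=False)
    landmarks = rng.choice(len(X), size=n_landmarks, replace=False)
    # Entry (i, j) = 1 iff point i can reach landmark j in the graph.
    # (The paper additionally requires the path to satisfy a curvature bound.)
    return (comp[:, None] == comp[landmarks][None, :]).astype(int)

rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((150, 3)),
               rng.standard_normal((150, 3)) + 8.0])     # placeholder: two separated surfaces
F = landmark_features(X)
labels = AgglomerativeClustering(n_clusters=2).fit_predict(F)
print(np.bincount(labels))
```

The binary feature matrix F plays the role of the abstract's per-point landmark-connectivity vector; any conventional clustering algorithm could be applied to it in place of the agglomerative step shown here.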
In this paper, we investigate the problem of detecting social contexts from audio recordings of everyday life, such as life-logs. Unlike standard corpora of telephone speech or broadcast news, these recordings contain a wide variety of background noise. By nature, in such applications it is difficult to collect and label all representative noise types for learning models in a fully supervised manner, and the amount of labeled data that can be expected is small relative to the available recordings. This lends itself naturally to unsupervised feature extraction using sparse auto-encoders, followed by supervised learning of a classifier for social contexts. We investigate different strategies for training these models and report results on a real-world application.
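A minimal sketch of this two-stage setup, assuming a PyTorch auto-encoder with an L1 sparsity penalty on the hidden activations and a scikit-learn classifier on top; the frame dimension, label set, and hyper-parameters are illustrative only and are not taken from the paper.

```python
# Sketch only: unsupervised sparse auto-encoder pre-training on unlabeled audio
# frames, then a supervised classifier on the small labeled subset.
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_unlabeled = torch.tensor(rng.standard_normal((5000, 40)), dtype=torch.float32)  # placeholder acoustic frames
X_labeled   = torch.tensor(rng.standard_normal((200, 40)),  dtype=torch.float32)  # small labeled subset
y_labeled   = rng.integers(0, 4, size=200)                                        # placeholder context labels

enc = nn.Linear(40, 128)                              # encoder
dec = nn.Linear(128, 40)                              # decoder
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
l1_weight = 1e-3                                      # sparsity penalty on hidden activations

for _ in range(200):                                  # unsupervised training on unlabeled frames
    h = torch.relu(enc(X_unlabeled))
    loss = ((dec(h) - X_unlabeled) ** 2).mean() + l1_weight * h.abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                                 # encode labeled data, then fit the classifier
    feats = torch.relu(enc(X_labeled)).numpy()
clf = LogisticRegression(max_iter=1000).fit(feats, y_labeled)
print(clf.score(feats, y_labeled))
```

The key design point the abstract argues for is visible here: the encoder is trained only on abundant unlabeled recordings, while the scarce labels are spent solely on the final classifier.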