This paper proposes the use of Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units for determining whether Mandarin-speaking individuals are afflicted with a form of Dysarthria based on samples of syllable pronunciations. Several LSTM network architectures are evaluated on this binary classification task, using accuracy and Receiver Operating Characteristic (ROC) curves as metrics. The LSTM models are shown to significantly improve upon a baseline fully connected network, reaching over 90% area under the ROC curve on the task of classifying new speakers, when a sufficient number of cepstrum coefficients are used. The results show that the LSTM's ability to leverage temporal information within its input makes for an effective step in the pursuit of accessible Dysarthria diagnoses.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.