Multimodal Inductive Transfer Learning for Detection of Alzheimer’s Dementia and its Severity

Sarawgi, Utkarsh; Zulfikar, Wazeer; Soliman, Nouran; Maes, Pattie

doi:10.21437/interspeech.2020-3137

Cited by 40 publications

(24 citation statements)

References 25 publications


“…Karlekar et al (2018) achieved 91% accuracy using a Convoluted Neural Network (CNN)-RNN model trained on part-of-speech-tagged utterances. Using CNN on both DementiaBank and ADReSS data, Sarawgi et al (2020) presented an ensemble of three models: disfluencies, acoustic, and intervention. Balagopalan et al (2020) and Pappagari et al (2020) showed that fine-tuned bidirectional encoder representations from transformers (BERT) outperformed models with hand-engineered features.…”

Section: Related Work (mentioning)

confidence: 99%

Classification of Alzheimer’s Disease Leveraging Multi-task Machine Learning Analysis of Speech and Eye-Movement Data

Jang, Soroski, Rizzo, et al. 2021

Front. Hum. Neurosci.


Alzheimer’s disease (AD) is a progressive neurodegenerative condition that results in impaired performance in multiple cognitive domains. Preclinical changes in eye movements and language can occur with the disease, and progress alongside worsening cognition. In this article, we present the results from a machine learning analysis of a novel multimodal dataset for AD classification. The cohort includes data from two novel tasks not previously assessed in classification models for AD (pupil fixation and description of a pleasant past experience), as well as two established tasks (picture description and paragraph reading). Our dataset includes language and eye movement data from 79 memory clinic patients with diagnoses of mild-moderate AD, mild cognitive impairment (MCI), or subjective memory complaints (SMC), and 83 older adult controls. The analysis of the individual novel tasks showed similar classification accuracy when compared to established tasks, demonstrating their discriminative ability for memory clinic patients. Fusing the multimodal data across tasks yielded the highest overall AUC of 0.83 ± 0.01, indicating that the data from novel tasks are complementary to established tasks.




“…Following several other works that used the DB data set (Hernández-Domínguez et al, 2018; Pou-Prom and Rudzicz, 2018; Sarawgi et al, 2020), all of our experiments are conducted with K-fold cross validation. While the small size of the DB data set helps to justify this as a validation procedure, optimizing a cross validated performance metric (accuracy, F1, etc.)…”

Section: Discussion (mentioning)

confidence: 99%

Fantastic Features and Where to Find Them: Detecting Cognitive Impairment with a Subsequence Classification Guided Approach

Eyre, Balagopalan, Novikova 2020

Proceedings of the Sixth Workshop on Noisy User-Generated Text (W-NUT 2020)


Despite the widely reported success of embedding-based machine learning methods on natural language processing tasks, the use of more easily interpreted engineered features remains common in fields such as cognitive impairment (CI) detection. Manually engineering features from noisy text is time and resource consuming, and can potentially result in features that do not enhance model performance. To combat this, we describe a new approach to feature engineering that leverages sequential machine learning models and domain knowledge to predict which features help enhance performance. We provide a concrete example of this method on a standard data set of CI speech and demonstrate that CI classification accuracy improves by 2.3% over a strong baseline when using features produced by this method. This demonstration provides an example of how this method can be used to assist classification in fields where interpretability is important, such as health care.


“…Methods with bimodal input features (both acoustic and linguistic) are also used for AD recognition in various studies (Sarawgi et al, 2020a; Sarawgi et al, 2020b; Campbell et al, 2020; Koo et al, 2020; Pompili et al, 2020; Rohanian et al, 2020). However, in this work, we restrict ourselves to the NLP-based approaches.…”

Section: Bimodal Methods (mentioning)

confidence: 99%

Recognition of Alzheimer’s Dementia From the Transcriptions of Spontaneous Speech Using fastText and CNN Models

Meghanani, Anoop, Ramakrishnan 2021

Front. Comput. Sci.


Alzheimer’s dementia (AD) is a type of neurodegenerative disease that is associated with a decline in memory. However, speech and language impairments are also common in Alzheimer’s dementia patients. This work is an extension of our previous work, where we had used spontaneous speech for Alzheimer’s dementia recognition employing log-Mel spectrogram and Mel-frequency cepstral coefficients (MFCC) as inputs to deep neural networks (DNN). In this work, we explore the transcriptions of spontaneous speech for dementia recognition and compare the results with several baseline results. We explore two models for dementia recognition: 1) fastText and 2) convolutional neural network (CNN) with a single convolutional layer, to capture the n-gram-based linguistic information from the input sentence. The fastText model uses a bag of bigrams and trigrams along with the input text to capture the local word orderings. In the CNN-based model, we try to capture different n-grams (we use n = 2, 3, 4, 5) present in the text by adapting the kernel sizes to n. In both fastText and CNN architectures, the word embeddings are initialized using pretrained GloVe vectors. We use bagging of 21 models in each of these architectures to arrive at the final model using which the performance on the test data is assessed. The best accuracies achieved with CNN and fastText models on the text data are 79.16 and 83.33%, respectively. The best root mean square errors (RMSE) on the prediction of mini-mental state examination (MMSE) score are 4.38 and 4.28 for CNN and fastText, respectively. The results suggest that the n-gram-based features are worth pursuing, for the task of AD detection. fastText models have competitive results when compared to several baseline methods. Also, fastText models are shallow in nature and have the advantage of being faster in training and evaluation, by several orders of magnitude, compared to deep models.
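The two ideas this abstract names, a bag of word bigrams and trigrams added to the input text and a bagged ensemble of 21 models, can be illustrated with a minimal Python sketch. This is not the authors' code: the helper names `word_ngrams` and `majority_vote`, the sample sentence, and the vote counts are all hypothetical, and combining the 21 models by majority vote is an assumption, since the abstract does not say how the bagged predictions are aggregated.

```python
from collections import Counter


def word_ngrams(tokens, n_values=(2, 3)):
    """Return word n-grams (space-joined) for each n in n_values."""
    grams = []
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def majority_vote(predictions):
    """Return the most frequent label among the ensemble's predictions.

    Assumption: bagged models are combined by simple majority vote.
    With an odd number of voters and two classes there are no ties.
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical transcript fragment; unigrams plus bigrams and trigrams
# form the bag of features that captures local word order.
tokens = "the boy is on the stool".split()
features = tokens + word_ngrams(tokens)

# Hypothetical per-model outputs from 21 bagged classifiers.
votes = ["AD"] * 12 + ["control"] * 9
final_label = majority_vote(votes)
```

Using an odd ensemble size such as 21 means a two-class majority vote always yields a decision, which may be one reason to prefer it, though the abstract does not state the motivation.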
