2021
DOI: 10.48550/arxiv.2106.01555
Preprint

Comparing Acoustic-based Approaches for Alzheimer's Disease Detection

Abstract: Robust strategies for Alzheimer's disease (AD) detection are important, given the high prevalence of AD. In this paper, we study the performance and generalizability of three approaches for AD detection from speech on the recent ADReSSo challenge dataset: 1) using conventional acoustic features, 2) using novel pre-trained acoustic embeddings, and 3) combining acoustic features and embeddings. We find that while feature-based approaches have a higher precision, classification approaches relying on the combination of em…
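The third approach in the abstract, combining acoustic features with embeddings, can be illustrated with a minimal sketch. It assumes early fusion by concatenation, which the truncated abstract does not confirm; the function name `fuse` and all values are hypothetical and are not taken from the paper.

```python
def fuse(acoustic_features, embedding):
    """Early fusion (an assumption, not the paper's confirmed method):
    concatenate two per-recording vectors into one classifier input."""
    return list(acoustic_features) + list(embedding)


# Made-up per-recording vectors, for illustration only.
acoustic = [0.12, 3.4, 0.07]   # e.g. hand-crafted acoustic statistics
embedding = [0.5, -0.2]        # e.g. a pooled pre-trained embedding
fused = fuse(acoustic, embedding)
```

In practice the two parts usually differ in scale, so each is often normalized before being passed to a classifier.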

Citations: cited by 5 publications (7 citation statements)
References: 13 publications
“…SSL models are specifically proposed for speech representations, such as TRILL [19], PACE+ [46], Mockingjay [39], Wav2Vec 2.0 [40], and HuBERT [41]. In particular, non-speech recognition application studies such as [7] use Wav2Vec 2.0 as a feature extractor.…”
Section: B Audio Pre-training Methods (mentioning)
confidence: 99%
“…Temporal pooling ablation results of (6) mean or (7) max pooling show that max pooling performs better than mean pooling on five tasks with a margin of 3.0 to 8.7%, indicating that max pooling can be more advantageous in general. While the (1) combination of both statistics slightly degrades performance on some tasks, it improves on average, showing that tasks benefit from the combination of these statistics.…”
Section: F Ablations Of Encoder Global Pooling Blocks (mentioning)
confidence: 99%
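The temporal pooling compared in the statement above (mean, max, and the combination of both statistics) can be sketched as follows. This is a minimal illustration assuming frame-level embeddings are given as a list of equal-length vectors; the function name `mean_max_pool` and the sample values are hypothetical, and the citing paper's actual implementation is not shown here.

```python
from statistics import fmean


def mean_max_pool(frames):
    """Pool a (time x dim) sequence of frame embeddings into one vector.

    Returns the per-dimension mean over time concatenated with the
    per-dimension max over time, so the output has 2 * dim entries.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    if any(len(frame) != dim for frame in frames):
        raise ValueError("all frames must have the same dimensionality")
    columns = list(zip(*frames))  # transpose: one tuple per feature dimension
    mean_part = [fmean(col) for col in columns]
    max_part = [max(col) for col in columns]
    return mean_part + max_part


# Three frames of a 2-dimensional embedding (made-up values).
frames = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]
pooled = mean_max_pool(frames)
```

Using only `mean_part` or only `max_part` gives the single-statistic variants; concatenating them doubles the vector length but keeps both the average level and the peak of each dimension.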
“…SSL models are specifically proposed for speech representations, including Wav2Vec 2.0 [42], PACE+ [43], and TRILL [19]. In particular, Wav2Vec 2.0 was fine-tuned on the automatic speech recognition (ASR) task and provided state-of-the-art performance, and it has also been used in non-ASR application studies [7].…”
Section: B Audio Pre-training Methods (mentioning)
confidence: 99%
“…Once sound is converted to textual information, signals that should have been present are missing as a result of dementia. There are models that use both speech signals and textual information to predict the risk of dementia (19).…”
Section: Early Models For Detection Of Dementia Using Deep Learning (mentioning)
confidence: 99%