2021
DOI: 10.48550/arxiv.2106.00531
Preprint

Supervised Speech Representation Learning for Parkinson's Disease Classification

Parvaneh Janbakhshi,
Ina Kodrasi

Abstract: Recently proposed automatic pathological speech classification techniques use unsupervised auto-encoders to obtain a high-level abstract representation of speech. Since these representations are learned based on reconstructing the input, there is no guarantee that they are robust to pathology-unrelated cues such as speaker identity information. Further, these representations are not necessarily discriminative for pathology detection. In this paper, we exploit supervised auto-encoders to extract robust and disc…

Cited by 2 publications
(1 citation statement)
References 24 publications
“…Namely, the performances of our best IFMs on CzechPD and ItalianPVS outperformed those reported by Kovac et al [41,42] based on acoustic descriptors by more than 10 percentual absolute points in ACC. Similarly, TRILLsson reported best AUCs of 0.95, 0.87, 0.84 on Neurovoz, GITA, and GermanPD, which were 0.01, 0.03, and 0.16 higher than the x-vector approach of [31], the CNN approach of [28], and the mono-lingual CNN approach of [27], respectively. In this regard, HuBERT/Wav2Vec 2.0. reported best AUCs of 0.96, 0.90, and 0.88 on Neurovoz, GITA, and GermanPD, respectively, which were even higher than those achieved with TRILLsson.…”
Section: Interpretable Vs Non-interpretable Features
Citation type: mentioning
confidence: 87%