Interspeech 2021
DOI: 10.21437/interspeech.2021-2180

Speech Disorder Classification Using Extended Factorized Hierarchical Variational Auto-Encoders

Abstract: Objective speech disorder classification for speakers with communication difficulty is desirable for diagnosis and administering therapy. With the current state of speech technology, it is evident to propose neural networks for this application. But neural network model training is hampered by a lack of labeled disordered speech data. In this research, we apply an extended version of Factorized Hierarchical Variational Autoencoders (FHVAE) for representation learning on disordered speech. The FHVAE model extra…

Cited by 4 publications
(3 citation statements)
References 12 publications
“…The dysarthria information, as a speaker characteristic, should be extracted to speaker variable z_2^{i,n} by the FHVAE. However, our previous work [16] found that the FHVAE does not separate the dysarthria and content information and speech impairment is identifiable from z_1^{i,n}. To obtain dysarthria-invariant features z_2^{i,n}, inspired by [17], we introduce adversarial training into the FHVAE model.…”
Section: FHVAE With Adversarial Training (mentioning)
confidence: 85%
“…The content variable (segment-related variable) appears to fit the speaker-independence desideratum outlined above. However, our previous work [16] shows that the dysarthria is reflected in both latent variables.…”
Section: Introduction (mentioning)
confidence: 87%