2023
DOI: 10.1109/ojsp.2023.3242862

Hierarchical Multi-Class Classification of Voice Disorders Using Self-Supervised Models and Glottal Features

Abstract: Previous studies on the automatic classification of voice disorders have mostly investigated the binary classification task, which aims to distinguish pathological voice from healthy voice. Using multi-class classifiers, however, more fine-grained identification of voice disorders can be achieved, which is more helpful for clinical practitioners. Unfortunately, there is little publicly available training data for many voice disorders, which lowers the classification performance on data from unseen speakers. Ea…

citations
Cited by 18 publications
(1 citation statement)
references
References 40 publications
“…The model learns representations from the raw speech, and these representations can be used in the required downstream task. Examples of pre-trained models are wav2vec2 and HuBERT that have shown good performance in various speech technology tasks, such as ASR, emotion recognition, speaker and language identification, and voice disorder detection [47,48,49,50,51,52,53]. There are, however, no studies on using recent self-supervised pre-trained models, such as wav2vec2 [47] and HuBERT [54], for voice quality classification.…”
Section: Introduction
Citation type: mentioning
confidence: 99%