2023
DOI: 10.1002/lio2.1144

End‐to‐end deep learning classification of vocal pathology using stacked vowels

George S. Liu,
Jordan M. Hodges,
Jingzhi Yu
et al.

Abstract:
Objectives: Advances in artificial intelligence (AI) technology have increased the feasibility of classifying voice disorders using voice recordings as a screening tool. This work develops upon previous models that take in single vowel recordings by analyzing multiple vowel recordings simultaneously to enhance prediction of vocal pathology.
Methods: Voice samples from the Saarbruecken Voice Database, including three sustained vowels (/a/, /i/, /u/) from 687 healthy human participants and 334 dysphonic patients, wer…

Cited by 4 publications (1 citation statement)
References 44 publications
“…Apart from these, there are different end-to-end approaches that handle all aspects of an audio profiling process, from the initial input to the final output, without requiring manual intervention or separate processing stages. Raw waveforms are taken as an input and different audio profiling tasks such as age estimation [36] [37], voice pathology detection [38], Speech Emotion Recognition [39], acoustic scene classification [40] [41] are obtained as output directly. End-to-end systems perform automatic feature extraction learning directly from raw inputs without manual intervention.…”
Section: B Feature Extraction
Citation type: mentioning
Confidence: 99%