ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023
DOI: 10.1109/icassp49357.2023.10095547

Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition

Cited by 8 publications (2 citation statements)
References 46 publications
“…For example, dysarthric speakers of very low speech intelligibility exhibit clearer patterns of articulatory imprecision, decreased volume and clarity, increased dysfluencies, slower speaking rate and changes in pitch [29], while those diagnosed with mid or high speech intelligibility are closer to normal speakers. Such heterogeneity further increases the mismatch against normal speech and the difficulty in both speaker-independent (SI) ASR system development using limited impaired speech data and fine-grained personalization to individual users' data [3,25,30]. So far, the majority of prior research addressing dysarthric speaker-level diversity has focused on using speaker identity alone, either in speaker-dependent (SD) data augmentation [7,9,13,14,18,27] or in speaker-adapted or speaker-dependent ASR system development [1,3,4,7,11-13,19,22,25,31-33]. In contrast, very little prior research has used speech impairment severity information for dysarthric speech recognition.…”
Section: Introduction
confidence: 99%
“…A set of novel techniques and recipe configurations was proposed to learn both speech impairment severity and speaker identity when constructing and personalizing these systems. In contrast, prior research mainly focused on using speaker identity alone in speaker-dependent data augmentation [7,9,13,14,18,27] and speaker-adapted or speaker-dependent ASR system development [1,3,4,7,11,13,19,22,23,25,31-33]. Very little prior research utilized speech impairment severity information [2,11,25,34-36].…”
Section: Introduction
confidence: 99%