Interspeech 2017
DOI: 10.21437/interspeech.2017-303

Multi-Stage DNN Training for Automatic Recognition of Dysarthric Speech

Abstract: Incorporating automatic speech recognition (ASR) in individualized speech training applications is becoming more viable thanks to the improved generalization capabilities of neural network-based acoustic models. The main problem in developing applications for dysarthric speech is the relative in-domain data scarcity. Collecting representative amounts of dysarthric speech data is difficult due to rigorous ethical and medical permission requirements, problems in accessing patients who are generally vulnerable an…
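The abstract does not spell out the training recipe, but the title and the data-scarcity argument suggest a staged approach: first train a DNN acoustic model on plentiful typical speech, then continue training on the scarce dysarthric data. The following is a minimal sketch of that general pattern in PyTorch, assuming this common pretrain-then-fine-tune recipe; the model shape, feature dimensions, data, and learning rates are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal two-stage training sketch: pretrain a frame-level DNN acoustic model
# on plentiful "typical" speech, then fine-tune it on a small dysarthric set.
# All data below is random stand-in data; dimensions and hyperparameters are
# illustrative assumptions, not the paper's configuration.
import torch
import torch.nn as nn

FEAT_DIM, NUM_STATES = 40, 1000   # e.g. 40-dim filterbanks, 1000 tied states

model = nn.Sequential(
    nn.Linear(FEAT_DIM, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, NUM_STATES),
)
criterion = nn.CrossEntropyLoss()

def train_stage(feats, labels, lr, epochs):
    """Run one frame-level cross-entropy training stage."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = criterion(model(feats), labels)
        loss.backward()
        opt.step()
    return loss.item()

# Stage 1: large amount of typical speech (random placeholder frames/labels).
typical_x = torch.randn(5000, FEAT_DIM)
typical_y = torch.randint(0, NUM_STATES, (5000,))
train_stage(typical_x, typical_y, lr=0.1, epochs=5)

# Stage 2: small dysarthric set, lower learning rate so the adapted model
# does not drift too far from the well-trained general model.
dys_x = torch.randn(300, FEAT_DIM)
dys_y = torch.randint(0, NUM_STATES, (300,))
train_stage(dys_x, dys_y, lr=0.01, epochs=10)
```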

Cited by 16 publications (25 citation statements); references 25 publications.
“…Containing more speakers with more diverse etiologies, this corpus is found to be more challenging for ASR than the CHASING01 dysarthric speech database (cf. the ASR results in [18] and [21]).…”
Section: Speech Corpora Selection (mentioning)
confidence: 99%
“…In this section, we present the results of the ASR experiments performed using the acoustic models in Section 4.1 trained on several speaker-independent features such as gammatone filterbanks, articulatory features and bottleneck features. Later, we apply the model adaptation approach we described in [29] to explore the further gains that can be obtained using a small amount of dysarthric training data. The baseline systems use DNN, CNN and TFCNN models trained on mel filterbank features and FCNN models trained using the concatenation of mel filterbank features and synthetic AFs.…”
Section: Results (mentioning)
confidence: 99%
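As a small illustration of the feature setup mentioned in the statement above, the sketch below concatenates mel filterbank features with articulatory feature (AF) estimates frame by frame before they are passed to a fused acoustic model. The dimensions and the random AF placeholders are assumptions for illustration; in the cited work the AFs are produced by a separate estimator, not generated randomly.

```python
# Illustrative sketch: build the input for a fused model by concatenating
# mel filterbank features with synthetic articulatory-feature (AF) estimates,
# frame by frame. Dimensions and the AF source are assumed for illustration.
import numpy as np

num_frames = 200
fbank = np.random.randn(num_frames, 40)     # 40-dim mel filterbank features
art_feats = np.random.rand(num_frames, 8)   # 8 articulatory feature streams

# Frame-wise concatenation yields a 48-dim input per frame for the fused model.
fused_input = np.concatenate([fbank, art_feats], axis=1)
assert fused_input.shape == (num_frames, 48)
```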
“…In the scope of the CHASING Project, we have been developing a serious game employing ASR to provide additional speech therapy to dysarthric patients [14,27]. Earlier research in this project is reported in [26] and [29], in which we respectively focus on investigating the available resources by using speech data from different varieties of the Dutch language, and on model adaptation to tune an acoustic model using a small amount of dysarthric speech for training.…”
Section: Related Work (mentioning)
confidence: 99%