2020
DOI: 10.1186/s13636-019-0169-5

Improving dysarthric speech recognition using empirical mode decomposition and convolutional neural network

Abstract: In this paper, we use empirical mode decomposition and Hurst-based mode selection (EMDH) along with deep learning architecture using a convolutional neural network (CNN) to improve the recognition of dysarthric speech. The EMDH speech enhancement technique is used as a preprocessing step to improve the quality of dysarthric speech. Then, the Mel-frequency cepstral coefficients are extracted from the speech processed by EMDH to be used as input features to a CNN-based recognizer. The effectiveness of the propos…
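The feature-extraction step named in the abstract (Mel-frequency cepstral coefficients computed from the enhanced speech) can be illustrated with a minimal sketch. This is not the authors' code: the frame length, sample rate, Hamming window, 20 mel filters, and 13 coefficients are all assumptions chosen for illustration, and the EMDH enhancement and the CNN recognizer are left out. A naive DFT keeps the example dependency-free; a real system would use an FFT library.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common mel-scale formula (assumed here; the paper's variant is not given).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame and return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * hz / sample_rate)) for hz in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * n_bins
        for k in range(lo, mid):      # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_ceps=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_e))
            for c in range(n_ceps)]


# Usage: one 256-sample frame of a 500 Hz tone at an assumed 16 kHz rate.
SAMPLE_RATE = 16000
frame = [math.sin(2 * math.pi * 500.0 * i / SAMPLE_RATE) for i in range(256)]
coeffs = mfcc_frame(frame, SAMPLE_RATE)
```

In a pipeline like the one the abstract describes, such per-frame vectors would be stacked over time into a 2-D map and passed to the CNN; how the paper arranges that input is not stated in the text above.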

citations
Cited by 40 publications
(12 citation statements)
references
References 15 publications
“…Compared to the standard system this new approach shows satisfactory results. [3] Arjun et al, (2019), devised a method which is used to correct the stutters found in a speech signal. To avoid the recurrence of same word, the speech is sampled into individual words by using appropriate thresholding and speech energy techniques.…”
Section: Literature Survey (mentioning)
confidence: 99%
“…According to reference [8], the pre-processing stage utilizes the speech improvement strategy of EMDH to enhance the speech quality of dysarthria. From the EMDH-processed speech, the cepstral coefficients of Mel frequency are extracted and sent into a CNN-based recognizer as input characteristics. The findings imply that the CNN is capable of retrieving latent characteristics of dysarthria speech and that it may be trained faster with fewer data.…”
Section: Related Work (mentioning)
confidence: 99%
“…The experiment results of this study showed that the CNN-based feature extraction from the MFCC map provided better word-recognition results than other conventional feature extraction methods. More recently, Yakoub et al [43] proposed an empirical model decomposition and Hurst-based model selection (EMDH)-CNN system to improve the recognition of dysarthric speech. The results showed that the proposed system provided higher accuracy than the hidden Markov with Gaussian Mixture model and the CNN model by 20.72% and 9.95%, respectively.…”
Section: Introduction (mentioning)
confidence: 99%