Dysarthria is a motor speech disorder, often arising from damage to the central nervous system, that impairs control of articulation and pitch and therefore degrades the distinctiveness of a speaker's voice, making dysarthric speaker recognition a challenging task. In this paper, a feature-extraction method based on deep belief networks is presented for identifying speakers suffering from dysarthria. The effectiveness of the proposed method is demonstrated and compared with the well-known Mel-frequency cepstral coefficient features. For classification, a multi-layer perceptron neural network with two structures is proposed. Our evaluations on the Universal Access speech database produced promising results and outperformed other baseline methods. In addition, speaker identification under both text-dependent and text-independent conditions is explored. The highest accuracy achieved by the proposed system is 97.3%.
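The pipeline described above — unsupervised DBN-style feature learning followed by an MLP classifier — can be sketched as follows. This is a minimal illustration using scikit-learn's `BernoulliRBM` and `MLPClassifier`; the synthetic data, layer sizes, and hyperparameters are assumptions for demonstration, not the paper's actual configuration.

```python
# Sketch: stacked RBMs as an unsupervised feature extractor, then an MLP
# classifier on top. All dimensions and hyperparameters are illustrative.
import numpy as np
from sklearn.neural_network import BernoulliRBM, MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import minmax_scale

rng = np.random.default_rng(0)
# Stand-in for per-frame acoustic features (e.g. filter-bank energies),
# scaled to [0, 1] as BernoulliRBM expects.
X = minmax_scale(rng.normal(size=(200, 40)))
y = rng.integers(0, 4, size=200)  # 4 hypothetical speaker labels

dbn_mlp = Pipeline([
    ("rbm1", BernoulliRBM(n_components=64, learning_rate=0.05,
                          n_iter=10, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=32, learning_rate=0.05,
                          n_iter=10, random_state=0)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                          random_state=0)),
])
dbn_mlp.fit(X, y)
preds = dbn_mlp.predict(X)
```

In practice the RBM layers would be pre-trained on unlabeled speech frames before the supervised MLP stage, which is where the DBN features gain their advantage over fixed MFCCs.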
Deep architectures are now widely used in machine learning. Deep Belief Networks (DBNs) are deep architectures that build a powerful generative model from training data and can be used for both classification and feature learning. A DBN can be trained in an unsupervised manner, and the learned features are then suitable for a simple classifier (such as a linear classifier) even with only a few labeled examples. Prior research shows that DBN training can be modified to produce features that are more interpretable and more discriminative. One such modification is encouraging sparsity in the learned features: sparse representations yield useful low-level features for unlabeled data, and the learned features tend to be interpretable, i.e., they correspond to meaningful aspects of the input and capture factors of variation in the data. Several methods have been proposed to build sparse RBMs. In this paper we propose a new method, nsDBN, whose behavior depends on how far the activations of the hidden units deviate from a (low) fixed target value; a variance parameter controls how strongly sparsity is enforced. According to our results, the new method always achieves the best recognition accuracy on the MNIST handwritten digit test set compared with state-of-the-art methods, including PCA, RBM, qsRBM, and rdsRBM, even when only 10 to 20 labeled samples per class are used as training data.
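The abstract does not spell out the nsDBN penalty, but the ingredients it names — a penalty driven by the deviation of hidden activations from a low target value, with a variance parameter controlling enforcement strength — can be illustrated with a Gaussian-shaped sparsity regularizer. The exact functional form below is an assumption for illustration, not the paper's formula.

```python
# Sketch of a deviation-based sparsity gradient for RBM hidden units.
# The Gaussian-shaped penalty form and the names p/sigma are assumptions.
import numpy as np

def sparsity_grad(hidden_probs, p=0.05, sigma=0.2):
    """Gradient of an illustrative penalty per hidden unit:
        1 - exp(-(q_j - p)^2 / (2 * sigma^2)),
    where q_j is the mean activation of unit j over a batch.
    Small sigma penalizes deviations from the low target p more sharply.
    """
    q = hidden_probs.mean(axis=0)          # mean activation per hidden unit
    dev = q - p
    return (dev / sigma**2) * np.exp(-dev**2 / (2 * sigma**2))
```

During training, this gradient would be added to the usual contrastive-divergence update for the hidden biases (and weights), pushing average activations toward the sparse target.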
Deep Belief Networks (DBNs) have deep architectures that create a powerful generative model from training data. In this paper we present an improvement to a method commonly used to train RBMs. The new method uses free energy as a criterion for selecting elite samples from the generative model; we argue that these samples yield a more accurate estimate of the gradient of the log-probability of the training data. With the proposed method, an error rate of 0.99% was achieved on the MNIST test set, outperforming the method presented in the paper that introduced DBNs (1.25% error rate) as well as general classification methods such as SVM (1.4% error rate) and KNN (1.6% error rate). In another test on the ISOLET dataset, the letter-classification error dropped to 3.59%, compared with the 5.59% error rate reported by previous work on this dataset. The implemented method is available online at "http://ceit.aut.ac.ir/~keyvanrad/DeeBNet Toolbox.html".
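The free energy of a binary RBM has a closed form, so ranking samples by it is cheap. The sketch below computes the standard free energy and, as an assumed selection rule (the abstract does not specify one), keeps the k samples with the lowest free energy, i.e. the highest unnormalized probability under the model.

```python
# Sketch: RBM free energy and an assumed "elite sample" selection rule.
import numpy as np

def free_energy(v, W, b_vis, b_hid):
    """Standard binary-RBM free energy:
        F(v) = -v . b_vis - sum_j softplus(b_hid_j + v . W[:, j])
    Lower free energy means higher unnormalized probability."""
    pre_activation = v @ W + b_hid
    # logaddexp(0, x) is a numerically stable softplus
    return -(v @ b_vis) - np.logaddexp(0.0, pre_activation).sum(axis=1)

def elite_samples(v, W, b_vis, b_hid, k):
    """Keep the k lowest-free-energy samples (assumed selection criterion)."""
    F = free_energy(v, W, b_vis, b_hid)
    return v[np.argsort(F)[:k]]
```

In a CD-style training loop, such elite negative samples would replace an unfiltered batch of Gibbs-chain samples when estimating the negative phase of the log-likelihood gradient.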