Assistive speech-enabled systems are proposed to help both French- and English-speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis to enhance the quality of communication. These systems aim to improve the intelligibility of pathological speech, making it as natural as possible and close to the speaker's original voice. The resynthesized utterances use new basic units, a new concatenation algorithm, and a grafting technique to correct poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) were carried out to demonstrate the efficiency of the proposed methods. Improvements in the Perceptual Evaluation of Speech Quality (PESQ) score of 5% and of more than 20% are achieved by the speech synthesis systems dealing with SSD and dysarthria, respectively.
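The grafting step described above can be pictured as splicing a well-pronounced reference segment into the utterance in place of the poorly pronounced phoneme, with a short crossfade at each join. The sketch below is only an illustration of that idea, not the paper's actual algorithm: the segment boundaries, fade length, and the graft_phoneme() helper are assumptions made for the example.

```python
import numpy as np

def crossfade(a, b, fade):
    """Overlap-add the tail of `a` with the head of `b` over `fade` samples."""
    ramp = np.linspace(0.0, 1.0, fade)
    mixed = a[-fade:] * (1.0 - ramp) + b[:fade] * ramp
    return np.concatenate([a[:-fade], mixed, b[fade:]])

def graft_phoneme(utterance, bad_start, bad_end, reference, fade=64):
    """Replace utterance[bad_start:bad_end] with a reference segment (both 1-D arrays)."""
    left = utterance[:bad_start]
    right = utterance[bad_end:]
    out = crossfade(left, reference, fade)   # join reference onto the left context
    return crossfade(out, right, fade)        # then rejoin the right context

if __name__ == "__main__":
    sr = 16000
    utterance = np.random.randn(3 * sr) * 0.1   # stand-in for a recorded utterance
    reference = np.random.randn(1200) * 0.1     # stand-in for a well-pronounced phoneme
    repaired = graft_phoneme(utterance, 8000, 9000, reference)
    print(repaired.shape)
```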
In this paper, we use empirical mode decomposition with Hurst-based mode selection (EMDH) together with a deep learning architecture based on a convolutional neural network (CNN) to improve the recognition of dysarthric speech. The EMDH speech enhancement technique is used as a preprocessing step to improve the quality of dysarthric speech. Mel-frequency cepstral coefficients (MFCCs) are then extracted from the EMDH-processed speech and used as input features to a CNN-based recognizer. The effectiveness of the proposed EMDH-CNN approach is demonstrated by results obtained on the Nemours corpus of dysarthric speech. Compared with baseline systems that use hidden Markov models with Gaussian mixture models (HMM-GMMs) and a CNN without the enhancement module, the EMDH-CNN system increases the overall accuracy by 20.72% and 9.95%, respectively, under a k-fold cross-validation experimental setup.
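A minimal sketch of this pipeline is given below: decompose the signal with EMD, keep only the intrinsic mode functions (IMFs) whose estimated Hurst exponent exceeds a threshold (0.5 is assumed here), reconstruct the enhanced signal, extract MFCCs, and feed them to a small CNN. The library choices (PyEMD, librosa, tf.keras), the R/S Hurst estimator, the threshold, and the network layout are assumptions for illustration; the paper's exact configuration may differ.

```python
import numpy as np
import librosa
from PyEMD import EMD
import tensorflow as tf

def hurst_rs(x):
    """Rough Hurst exponent estimate via rescaled-range (R/S) analysis."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    sizes = [2 ** k for k in range(4, int(np.log2(n)))]
    rs_vals = []
    for s in sizes:
        chunks = x[: n - n % s].reshape(-1, s)
        dev = np.cumsum(chunks - chunks.mean(axis=1, keepdims=True), axis=1)
        r = dev.max(axis=1) - dev.min(axis=1)
        sd = chunks.std(axis=1)
        mask = sd > 0
        rs_vals.append(np.mean(r[mask] / sd[mask]))
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_vals), 1)
    return slope

def emdh_enhance(signal, hurst_threshold=0.5):
    """Reconstruct the signal from IMFs judged signal-dominant by their Hurst exponent."""
    imfs = EMD()(signal)
    keep = [imf for imf in imfs if hurst_rs(imf) >= hurst_threshold]
    return np.sum(keep, axis=0) if keep else signal

def mfcc_features(signal, sr=16000, n_mfcc=13):
    """MFCC matrix transposed to (frames, coefficients)."""
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).T

def build_cnn(n_frames, n_mfcc, n_classes):
    """Small CNN over fixed-size MFCC 'images' of shape (frames, coefficients, 1)."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_frames, n_mfcc, 1)),
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
```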
In this paper, we propose a mobile spoken dialogue system with a new spoken language understanding (SLU) architecture. This new SLU module combines an ontology and a dependency graph to perform semantic analysis. At each turn of the dialogue, the turn-analysis algorithm integrated in the SLU module uses the dependencies generated by the Stanford parser together with a domain ontology to analyze the sentence and extract the user's intention and slot values (i.e. user dialogue acts, concepts, and their values). The SLU module maps the sentence into a semantic structure. The dialogue manager receives this structure from the SLU module and chooses the action to be taken by the system. The mobile spoken dialogue system was developed as a remote system on a mobile phone. It uses the Google server for speech recognition and the Google text-to-speech engine provided with the Android system for speech synthesis. The dialogue understanding, dialogue management, and text generation modules reside on a remote computer. Ten users tested the first version of our system, and a score of 3.6 on a Likert scale was obtained.
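The turn-analysis idea can be illustrated with a toy sketch: parse the user's utterance, then walk the dependency edges and a small domain ontology to fill a (dialogue act, concept, value) structure. In this sketch, spaCy stands in for the Stanford parser, and the ontology and act keywords are invented for the example; they are not the paper's actual resources.

```python
import spacy

# Toy "ontology": lexical triggers mapped to domain concepts (illustrative only).
ONTOLOGY = {"flight": "FLIGHT", "ticket": "FLIGHT", "hotel": "HOTEL"}
ACT_KEYWORDS = {"book": "request_booking", "cancel": "request_cancel", "want": "inform"}

def analyze_turn(text, nlp):
    """Map one user turn to {act, slots} using dependency relations plus the ontology."""
    doc = nlp(text)
    frame = {"act": "inform", "slots": {}}
    for token in doc:
        if token.lemma_ in ACT_KEYWORDS:
            frame["act"] = ACT_KEYWORDS[token.lemma_]
        if token.lemma_ in ONTOLOGY:
            concept = ONTOLOGY[token.lemma_]
            # Slot values are taken from dependents of the trigger word,
            # e.g. prepositional objects or modifiers ("flight to Boston").
            values = [t.text for t in token.subtree
                      if t.dep_ in ("pobj", "amod", "nummod") and t is not token]
            frame["slots"][concept] = values or [token.text]
    return frame

if __name__ == "__main__":
    nlp = spacy.load("en_core_web_sm")
    print(analyze_turn("I want to book a flight to Boston", nlp))
    # e.g. {'act': 'request_booking', 'slots': {'FLIGHT': ['Boston']}}
```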