Developmental dysphasia, also known as specific language impairment (SLI), is a language disorder in children that involves difficulty in speaking and in understanding spoken words. Detecting SLI at an early stage is crucial for successful speech therapy in children. In this paper, we propose a novel approach based on glottal source features for detecting children with SLI from the speech signal. The proposed method utilizes time- and frequency-domain glottal parameters, which are extracted from the voice source signal obtained through glottal inverse filtering (GIF). In addition, Mel-frequency cepstral coefficient (MFCC) and openSMILE-based acoustic features are extracted from the speech utterances. Two machine learning algorithms, namely, support vector machine (SVM) and feed-forward neural network (FFNN), are trained separately on the MFCC, openSMILE, and glottal features. A leave-fourteen-speakers-out cross-validation strategy is used for evaluating the classifiers. The experiments are conducted on the SLI speech corpus released by the LANNA research group. Experimental results show that the glottal parameters carry significant discriminative information for identifying children with SLI. Furthermore, the complementary nature of the glottal parameters is investigated by combining these features independently with the MFCC and openSMILE acoustic features. The overall results indicate that the glottal features, when combined with the MFCC feature set, provide the best performance with the FFNN classifier in the speaker-independent scenario.
INDEX TERMS Developmental dysphasia, openSMILE, glottal source parameters, support vector machines, artificial neural networks.
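To make the evaluation protocol concrete, the sketch below illustrates one way to realize speaker-independent cross-validation with an SVM, assuming utterance-level feature vectors (glottal, MFCC, or openSMILE) have already been extracted. All data, dimensions, and variable names are synthetic placeholders, not the paper's corpus or code.

```python
# A minimal sketch of the classification stage, assuming per-utterance
# feature vectors are already extracted; all names and sizes below are
# illustrative placeholders, not the authors' setup.
import numpy as np
from sklearn.model_selection import GroupKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_utt, n_feat, n_speakers = 280, 20, 56             # toy dimensions
features = rng.normal(size=(n_utt, n_feat))         # utterance-level features
labels = rng.integers(0, 2, size=n_utt)             # 0 = healthy, 1 = SLI
speakers = rng.integers(0, n_speakers, size=n_utt)  # one speaker id per utterance

# GroupKFold keeps all utterances of a speaker in the same fold; with
# 56 speakers and 4 folds, each test fold holds out roughly 14 speakers.
accuracies = []
for train_idx, test_idx in GroupKFold(n_splits=4).split(features, labels, speakers):
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(features[train_idx], labels[train_idx])
    accuracies.append(clf.score(features[test_idx], labels[test_idx]))
print(f"mean speaker-independent accuracy: {np.mean(accuracies):.3f}")
```

Grouping by speaker prevents utterances from the same child appearing in both training and test sets, which is what makes the reported accuracy speaker-independent.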
Automatic voice pathology detection enables objective assessment of pathologies that affect the voice production mechanism. Detection systems have been developed using both the traditional pipeline approach (consisting of a feature extraction part and a detection part) and the modern deep learning-based end-to-end approach. Due to the scarcity of training data in the study area of pathological voice, the former approach remains a valid choice. In existing detection systems based on the traditional pipeline approach, mel-frequency cepstral coefficient (MFCC) features can be regarded as the de facto standard feature set. In this study, automatic voice pathology detection is investigated by comparing the performance of various MFCC variants derived by considering two factors: the input and the filterbank used in the cepstrum computation. For the first factor, three inputs (the voice signal, the glottal source, and the vocal tract) are compared. The glottal source and the vocal tract are estimated using the quasi-closed phase (QCP) glottal inverse filtering method. For the second factor, mel-frequency and linear-frequency filterbanks are compared. Experiments were conducted separately on six databases consisting of voices produced by speakers suffering from one of four disorders (dysphonia, Parkinson's disease, laryngitis, or heart failure) and by healthy speakers. A support vector machine (SVM) was used as the classifier. The results show that combining mel- and linear-frequency cepstral coefficients derived from the glottal source and the vocal tract yields better overall detection accuracy than the de facto MFCC features derived from the voice signal. Furthermore, this combination provided comparable or better performance than four existing cepstral feature extraction techniques in clean and high signal-to-noise ratio (SNR) conditions.
INDEX TERMS Voice disorders, glottal inverse filtering, support vector machine, cepstral coefficients.
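As an illustration of the second factor (the filterbank), the following sketch computes standard MFCCs alongside a simplified linear-frequency cepstrum (a DCT of the log power spectrum, omitting the triangular filterbank for brevity). The QCP inverse filtering step that would yield the glottal source and vocal tract inputs is not shown; the example signal `y` merely stands in for whichever of the three inputs is used.

```python
import numpy as np
import librosa
from scipy.fftpack import dct

# Stand-in signal; in the paper this would be the voice signal, the glottal
# source, or the vocal tract response estimated by QCP inverse filtering.
y, sr = librosa.load(librosa.ex("trumpet"), sr=16000)

# Mel-frequency cepstral coefficients (the de facto standard features).
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=512, hop_length=160)

# Simplified linear-frequency variant: log power spectrum frames followed
# by a DCT, i.e., the same cepstrum computation without the mel warping.
power_spec = np.abs(librosa.stft(y, n_fft=512, hop_length=160)) ** 2
lfcc = dct(np.log(power_spec + 1e-10), type=2, axis=0, norm="ortho")[:13]

print(mfcc.shape, lfcc.shape)  # both (13, n_frames)
```

The two variants differ only in the frequency warping applied before the DCT, which is why the paper can treat the filterbank as an independent design factor and concatenate the resulting coefficient sets.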
The aim of this paper is to develop a flexible framework capable of automatically recognizing the phonetic units present in a speech utterance of any language spoken in any mode. In this study, we consider two modes of speech, conversational and read, in four Indian languages, namely, Telugu, Kannada, Odia, and Bengali. The proposed approach consists of two stages: (1) automatic speech mode classification (SMC) and (2) automatic phonetic recognition using a mode-specific multilingual phone recognition system (MPRS). Vocal tract and excitation source features are used for the SMC task, and the SMC systems are developed using a multilayer perceptron (MLP). Further, vocal tract, excitation source, and tandem features are used to build the deep neural network (DNN)-based MPRSs. The performance of the proposed approach is compared with that of mode-dependent MPRSs. Experimental results show that the proposed approach, which combines SMC and MPRS into a single system, outperforms the baseline mode-dependent MPRSs.
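The two-stage design can be viewed as a routing scheme: an MLP first predicts the mode of an utterance, and the prediction selects which mode-specific recognizer processes it. The sketch below, with stubbed-out recognizers and synthetic features, is only an illustration of that control flow; the actual MPRSs are DNN-based phone recognizers, and all names here are hypothetical.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
n_utt, n_feat = 400, 39                  # toy utterance-level dimensions
X = rng.normal(size=(n_utt, n_feat))     # vocal tract + excitation source feats
modes = rng.integers(0, 2, size=n_utt)   # 0 = read, 1 = conversational

# Stage 1: MLP-based speech mode classifier (SMC).
smc = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=1)
smc.fit(X, modes)

# Stage 2: stub recognizers standing in for the DNN-based mode-specific MPRSs.
def read_mprs(feats):
    return ["placeholder", "phone", "sequence"]

def conv_mprs(feats):
    return ["placeholder", "phone", "sequence"]

def recognize(utterance_feats):
    """Route an utterance to the recognizer matching its predicted mode."""
    mode = smc.predict(utterance_feats.reshape(1, -1))[0]
    return (read_mprs if mode == 0 else conv_mprs)(utterance_feats)

print(recognize(X[0]))
```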