Diagnosis of pathological voice is one of the most important issues in biomedical applications of speech technology. This study focuses on the classification of pathological voice using a Hidden Markov Model (HMM), a Gaussian Mixture Model (GMM), and a Support Vector Machine (SVM), and compares the results with previous work that used an Artificial Neural Network (ANN). Speech data were collected from speakers with and without vocal disorders, and normal and pathological speech data were mixed in our experiment. Six characteristic parameters (Jitter, Shimmer, NHR, SPI, APQ, and RAP) were chosen. The pattern recognition methods (HMM, GMM, and SVM) were then used to separate the mixed data into normal and pathological speech. We found that the GMM-based method gave superior classification rates compared to the other classification methods.
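As an illustration of the perturbation parameters named above, the following minimal sketch computes local jitter, local shimmer, and RAP from per-cycle measurements. It assumes the commonly used definitions (mean absolute cycle-to-cycle difference relative to the mean, and deviation from a three-cycle moving average for RAP); the abstract does not state which definitions or tools were used, and the function names and sample values are hypothetical.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, relative to the mean, in percent.

    Applied to pitch periods this gives local jitter; applied to peak
    amplitudes it gives local shimmer (assumed definitions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def rap(periods):
    """Relative average perturbation in percent.

    Each interior period is compared with the average of itself and its
    two neighbours; the mean deviation is normalised by the mean period.
    """
    if len(periods) < 3:
        raise ValueError("need at least three cycles")
    deviations = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3.0)
        for i in range(1, len(periods) - 1)
    ]
    return 100.0 * mean(deviations) / mean(periods)


# Hypothetical per-cycle measurements, for illustration only.
periods_ms = [8.0, 8.2, 7.9, 8.1, 8.0]
amplitudes = [0.50, 0.52, 0.49, 0.51, 0.50]

jitter_percent = local_perturbation(periods_ms)
shimmer_percent = local_perturbation(amplitudes)
rap_percent = rap(periods_ms)
```

Values such as these could then form part of the feature vector passed to a classifier; the remaining parameters (NHR, SPI, APQ) would need their own extraction steps.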
The aim of this paper is to analyze and discriminate pathological voice by separating the signal into periodic and aperiodic parts. Separation was performed recursively on the residual signal of the voice signal. Starting from an initial estimate of the aperiodic part of the spectrum, the aperiodic part is determined by an extrapolation method. The periodic part is then obtained by subtracting the aperiodic part from the original spectrum. An HNR parameter is derived from this separation. The statistics of this parameter are compared with those of Jitter and Shimmer for normal, benign, and malignant cases.
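The subtraction step and the resulting ratio can be sketched as follows. This is a minimal sketch, assuming that HNR denotes the usual harmonics-to-noise ratio in dB, that the subtraction is done on power spectra, and that negative differences are clipped to zero; the abstract specifies none of these details, and the recursive extrapolation that produces the aperiodic estimate is not reproduced here. The function name `hnr_db` is hypothetical.

```python
import math


def hnr_db(original_power, aperiodic_power):
    """Ratio of periodic to aperiodic energy in dB (assumed HNR definition).

    original_power: power spectrum of the voice frame, one value per bin.
    aperiodic_power: estimated aperiodic (noise) power per bin.
    """
    if len(original_power) != len(aperiodic_power):
        raise ValueError("spectra must have the same number of bins")
    # Periodic part = original spectrum minus aperiodic part, clipped at zero
    # (the clipping is an assumption, not stated in the source).
    periodic_power = [
        max(o - a, 0.0) for o, a in zip(original_power, aperiodic_power)
    ]
    harmonic_energy = sum(periodic_power)
    noise_energy = sum(aperiodic_power)
    if harmonic_energy <= 0.0 or noise_energy <= 0.0:
        raise ValueError("both energies must be positive")
    return 10.0 * math.log10(harmonic_energy / noise_energy)


# Hypothetical two-bin spectra: periodic energy 20, noise energy 2.
example_hnr = hnr_db([11.0, 11.0], [1.0, 1.0])
```

Under this definition, a voice with a larger aperiodic component yields a lower value.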
The aim of this study is to synthesize pathological breathy voice and to build a cepstral peak prominence (CPP) table organized by breathiness rank, using cepstral analysis, in order to improve the reliability of perceptual auditory judgment. The KlattGrid synthesizer included in Praat was used. The synthesis parameters fall into two groups: constants and variables. The constant parameters are pitch, amplitude, flutter, open phase, oral formant, and bandwidth. The variable parameters are breathiness (BR), aspiration amplitude (AH), and spectral tilt (TL). Five hundred sixty samples of the synthetic breathy vowel /a/ were created for a male voice. Three raters ranked the breathiness of the samples. Of these, 217 samples proved inadequate on the basis of perceptual judgment and cepstral analysis, and the remaining 343 samples were selected. The CPP values and other related parameters from cepstral analysis are classified into four breathiness ranks (B0 to B3). The mean and standard deviation of CPP are 16.10±1.15 dB (B0), 13.68±1.34 dB (B1), 10.97±1.41 dB (B2), and 3.03±4.07 dB (B3). CPP decreases toward the severe breathiness group because severely breathy voice contains a large amount of noise and few harmonic components.
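One way such a table might be consulted is sketched below: a measured CPP value is matched to the rank whose reported mean is closest in units of that rank's standard deviation. This is an illustrative assumption, not a procedure described in the abstract; only the means and standard deviations come from the text, and the names `CPP_TABLE` and `closest_rank` are hypothetical.

```python
# Mean and standard deviation of CPP in dB per breathiness rank,
# as reported in the abstract above.
CPP_TABLE = {
    "B0": (16.10, 1.15),
    "B1": (13.68, 1.34),
    "B2": (10.97, 1.41),
    "B3": (3.03, 4.07),
}


def closest_rank(cpp_db):
    """Return the rank whose mean is nearest to cpp_db in standard-deviation units.

    This z-score lookup is an assumed, simplified rule for illustration;
    it ignores how many samples fall in each rank.
    """
    return min(
        CPP_TABLE,
        key=lambda rank: abs(cpp_db - CPP_TABLE[rank][0]) / CPP_TABLE[rank][1],
    )


# Hypothetical measured values.
ranks = [closest_rank(value) for value in (16.0, 13.5, 11.0, 3.0)]
```

A lookup of this kind could serve only as a cross-check on a rater's judgment, since the table was derived from synthetic male /a/ samples.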