Mel-Frequency Cepstral Coefficients (MFCCs) are features that have been widely and successfully used in various speech processing applications. These features are extracted using the Fourier transform. However, this transform suffers from crucial restrictions when used to analyze nonlinear and non-stationary signals such as speech. To address this problem, in the present study we investigate the application of Empirical Mode Decomposition (EMD) to extracting more efficient and robust features for automatic gender identification. In the proposed approach, the speech signal is first decomposed by EMD into a set of narrow-band oscillatory modes, called intrinsic mode functions (IMFs), from which mel-frequency cepstral features can be extracted. However, multi-band decomposition of all modes yields some redundant and even irrelevant features that can degrade classification performance. Therefore, we propose to select the most discriminative frequency bands over all modes. The minimal-redundancy-maximal-relevance (mRMR) feature selection algorithm is also examined for this purpose. The proposed EMD-based features are then extracted by applying the discrete cosine transform (DCT) to log power values calculated over the selected mel-scale bands of the IMFs. Simulation results show that using the proposed features for automatic gender identification considerably improves system performance, particularly in noisy environments.
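The final extraction step described above (log power over selected mel-scale bands of the IMFs, followed by a DCT) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the IMFs have already been obtained by some EMD routine (pure sinusoids stand in for them here), and the function names, the 12-band triangular filterbank, the 8 kHz sampling rate, and the list of selected (IMF, band) pairs are all hypothetical choices made for illustration; in the study, the selection would come from a band-selection procedure such as mRMR.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """One-sided power spectrum via a direct DFT (fine for short frames)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2 / n)
    return spec

def mel_filterbank(n_bands, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed design)."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [low + i * (high - low) / (n_bands + 1) for i in range(n_bands + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for b in range(1, n_bands + 1):
        left, center, right = bins[b - 1], bins[b], bins[b + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):       # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge, peak at center
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank

def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]

def emd_band_features(imf_frames, selected, bank, n_coeffs, eps=1e-12):
    """DCT of log band powers over selected (imf_index, band_index) pairs."""
    spectra = [power_spectrum(frame) for frame in imf_frames]
    log_power = [
        math.log(sum(w * p for w, p in zip(bank[b], spectra[m])) + eps)
        for m, b in selected
    ]
    return dct2(log_power, n_coeffs)

# Stand-in "IMFs": two sinusoids in place of real EMD output (assumption).
sample_rate, n = 8000, 256
imf_frames = [
    [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n)],
    [math.sin(2 * math.pi * 250 * t / sample_rate) for t in range(n)],
]
bank = mel_filterbank(12, n, sample_rate)
selected = [(0, 4), (0, 5), (1, 1), (1, 2)]  # hypothetical selection
features = emd_band_features(imf_frames, selected, bank, n_coeffs=3)
```

Because only the selected bands contribute log power values, the length of the DCT input equals the number of retained (IMF, band) pairs, which is how discarding redundant bands shortens the feature vector.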