Automatic speaker recognition has become a well-established technique for forensic applications. Because ambient recordings in such applications are obtained with hidden microphones placed far from the sound sources, speaker recognition performance can be severely degraded. In this paper, we propose an array signal processing method to compensate for these disturbances. It spatially separates the individual speakers and the noise using convolutive Independent Component Analysis, then applies a noise-suppression method based on spectral subtraction to the separated sound signals. A speaker recognition scheme based on Mel-Frequency Cepstral Coefficients and Gaussian Mixture Models is then applied to the separated and noise-cancelled signals. Our proposed pre-processing method dramatically increases the reliability of speaker recognition under these adverse conditions and outperforms state-of-the-art solutions.