In this paper we introduce a new double-talk-free spoken dialogue interface that combines sound field control with a source separation technique based on independent component analysis (ICA). First, sound field control creates silent zones at the microphone elements and prevents the response sound from being observed. Second, we propose a novel semi-blind source separation algorithm to suppress the error caused by fluctuation of the room transfer function. Because the response sound signal is fed directly to ICA, the source separation problem is converted into a supervised learning problem. Since the problem becomes easier, the proposed method achieved higher performance than the method using blind source separation.
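To illustrate why a known response signal makes the problem easier, the following is a minimal sketch only. It is not the semi-blind ICA algorithm of the abstract: it swaps in a plain per-frequency least-squares fit, assuming the microphone and response signals are already available as complex STFT arrays of shape (frequency bins, frames). The function name `cancel_known_reference` and the parameter `eps` are hypothetical.

```python
import numpy as np

def cancel_known_reference(mic_spec, ref_spec, eps=1e-12):
    """Remove a known reference from a microphone spectrogram.

    Assumption: this is an ordinary least-squares fit per frequency bin,
    used here only to show the supervised setting; it is not the ICA
    update described in the abstract.
    mic_spec, ref_spec: complex arrays of shape (freq_bins, frames).
    """
    # Cross-spectrum between microphone and reference, summed over frames
    cross = np.sum(mic_spec * np.conj(ref_spec), axis=1)
    # Reference power per bin (eps avoids division by zero)
    power = np.sum(np.abs(ref_spec) ** 2, axis=1) + eps
    # Least-squares estimate of the reference-to-microphone gain per bin
    gain = cross / power
    # Subtract the predicted reference component from the observation
    return mic_spec - gain[:, None] * ref_spec
```

Because the reference is observed directly, the gain can be fitted with a closed-form regression instead of being inferred blindly from the mixture alone.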
We propose a new blind spatial subtraction array (BSSA) that contains an accurate noise estimator based on independent component analysis (ICA) for the realization of noise-robust hands-free speech recognition. Many previous studies on ICA-based blind source separation dealt with the special case of speech-speech mixing. However, such sound mixing is not realistic under common acoustic conditions; the target speech can be approximated as a point source, but real noises are often not point sources. Under this condition, our preliminary experiment suggests that conventional ICA is better suited to noise estimation than to direct speech estimation. Based on this finding, we propose a new noise reduction method that subtracts the power spectrum of the noise estimated by ICA from the power spectrum of the noise-contaminated observations. This architecture provides speech enhancement that is robust to noise-estimation errors, unlike simple linear-filtering-based enhancement. Although nonlinear processing often generates an artificial distortion, the so-called musical noise, it is still applicable to a speech recognition system because the speech decoder is not very sensitive to such distortion. Experimental results reveal that the proposed BSSA can improve the speech recognition rate by 20% compared with the conventional ICA.
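The subtraction step described above can be sketched as follows. This is a minimal illustration of generic power-spectral subtraction, assuming the observation and the noise estimate are given as complex STFT arrays of equal shape; it does not reproduce the full BSSA structure. The name `spectral_subtract` and the parameters `beta` (oversubtraction factor) and `floor` (flooring factor) are assumptions, not values from the abstract.

```python
import numpy as np

def spectral_subtract(observed_spec, noise_spec, beta=1.0, floor=0.0):
    """Subtract an estimated noise power spectrum from the observation.

    observed_spec, noise_spec: complex STFT arrays of the same shape.
    beta and floor are hypothetical tuning parameters.
    """
    obs_power = np.abs(observed_spec) ** 2
    noise_power = np.abs(noise_spec) ** 2
    # Nonlinear step: subtract in the power domain, bin by bin
    diff = obs_power - beta * noise_power
    # Bins where the subtraction goes negative are clipped to a floor
    enhanced_power = np.maximum(diff, floor * obs_power)
    # Reuse the phase of the noisy observation for the output
    return np.sqrt(enhanced_power) * np.exp(1j * np.angle(observed_spec))
```

Working in the power domain means the phase of the noise estimate is never used, which is one reason this kind of processing tolerates errors in the noise estimate; the clipping step is also where musical noise typically originates.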
A new two-stage blind source separation (BSS) method for convolutive mixtures of speech is proposed, in which a single-input multiple-output (SIMO)-model-based independent component analysis (ICA) and a new SIMO-model-based binary masking are combined. SIMO-model-based ICA enables us to separate the mixed signals, not into monaural source signals but into SIMO-model-based signals from independent sources in their original form at the microphones. Thus, the separated signals of SIMO-model-based ICA can maintain the spatial qualities of each sound source. Owing to this attractive property, our novel SIMO-model-based binary masking can be applied to efficiently remove the residual interference components after SIMO-model-based ICA. The experimental results reveal that the separation performance can be considerably improved by the proposed method compared with that achieved by conventional BSS methods. In addition, the real-time implementation of the proposed BSS is illustrated.
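The binary masking stage can be illustrated with a minimal sketch. It assumes two complex STFT arrays of equal shape, one for the target-dominant output and one for the interference-dominant output at the same microphone, and uses a simple magnitude comparison per time-frequency bin. The function name `binary_mask` and the exact decision rule are assumptions; the abstract does not specify them.

```python
import numpy as np

def binary_mask(target_spec, interference_spec):
    """Keep only the bins where the target output is dominant.

    target_spec, interference_spec: complex STFT arrays of equal shape,
    assumed to come from a preceding separation stage.
    """
    # True where the target magnitude exceeds the interference magnitude
    mask = np.abs(target_spec) > np.abs(interference_spec)
    # Pass dominant bins unchanged and zero out the rest
    return np.where(mask, target_spec, 0.0)
```

Since the mask only passes or zeroes bins, the retained bins keep their original amplitude and phase.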