Two major challenges in microphone-array-based adaptive beamforming, speech enhancement and distant speech recognition are robust, accurate source localization and voice activity detection. This paper introduces a spatial gradient steered response power using the phase transform (SRP-PHAT) method that can localize competing speakers whose speech overlaps. We further investigate the behavior of the SRP function and characterize theoretically a fixed point in its search space for the diffuse noise field. We call this fixed point the null position in the SRP search space. Building on this result, we propose a technique for multichannel voice activity detection (MVAD) based on detection of a maximum power corresponding to the null position. Together, the gradient SRP-PHAT and the MVAD form an integrated framework for multi-source localization and voice activity detection. Experiments carried out on real data recordings show that this framework is effective in practical hands-free communication applications.
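For readers unfamiliar with the phase transform, the following is a minimal sketch of GCC-PHAT for a single microphone pair, the pairwise quantity that conventional SRP-PHAT sums over all pairs at each candidate source position. It is not the gradient method introduced in the abstract. The function name `gcc_phat`, its parameters and the small regularization constant are assumptions made for illustration.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Estimate the delay of x relative to y (seconds) with GCC-PHAT.

    A positive result means x lags y. Hypothetical helper for illustration.
    """
    n = len(x) + len(y)                      # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)                   # cross-power spectrum
    cross /= np.abs(cross) + 1e-12           # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)            # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs
```

Under these assumptions, an SRP-PHAT map would be obtained by evaluating the correlation of every pair at the lag implied by each candidate position and summing the results; the sketch stops at the single-pair delay estimate.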
The performance of Adaptive Noise Cancellation (ANC) degrades severely when uncorrelated noise components are present at the two inputs. Diffuse background noise, which is common in practice, therefore poses a serious problem for ANC systems. In this research, we propose a new hybrid system that integrates Subband Adaptive Filters (SAFs) and a Wiener filter. The hybrid system is implemented on an oversampled DFT filterbank that efficiently integrates the SAF and Wiener filter components in the frequency domain. Performance evaluation of the hybrid system in the presence of diffuse noise interference shows that the proposed system is superior to both the Wiener filter and the SAF subsystems.
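As background, the two-input ANC structure referred to above can be sketched with a fullband normalized LMS (NLMS) filter. This is only the conventional baseline, assuming the reference input is correlated with the noise in the primary input; it is not the subband or Wiener hybrid described in the abstract. The function name `nlms_anc` and the parameters `taps`, `mu` and `eps` are illustrative assumptions.

```python
import numpy as np

def nlms_anc(primary, reference, taps=8, mu=0.5, eps=1e-8):
    """Fullband NLMS adaptive noise canceller (illustrative baseline).

    primary   : desired signal plus noise that is correlated with `reference`
    reference : noise-only input
    Returns the enhanced output and the final filter weights.
    """
    w = np.zeros(taps)                       # adaptive filter weights
    buf = np.zeros(taps)                     # most recent reference samples
    out = np.zeros(len(primary))
    for n in range(len(primary)):
        buf = np.roll(buf, 1)
        buf[0] = reference[n]
        noise_estimate = w @ buf
        e = primary[n] - noise_estimate      # error doubles as the enhanced output
        w += mu * e * buf / (buf @ buf + eps)  # normalized LMS update
        out[n] = e
    return out, w
```

Any noise component in the primary input that is uncorrelated with the reference passes through this structure unchanged, which is the limitation the abstract attributes to diffuse noise.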
In some applications, the signals received by an array are a mixture of signals emitted by both far-field and near-field sources. This study develops a new cumulant-based multiple signal classification (MUSIC) algorithm for source localisation using a new structured sparse array for scenarios where far-field and near-field sources coexist. The key feature of this algorithm is that it utilises fourth-order cumulants to compute the virtual covariance matrix and constructs a new special cumulant matrix to obtain the largest number of virtual sensors and the largest array aperture for a given number of sensors. The authors provide a geometric proof to justify the use of the proposed sparse linear array and compute the effective aperture of the array. The proposed algorithm improves resolution, direction of arrival (DOA) and range estimation accuracy, and increases the number of sources that can be localised. A further advantage is that the method does not use the information from all sensors, which keeps its computational complexity relatively low even though it exploits many actual and virtual sensors. Monte Carlo simulations are provided to demonstrate the effectiveness of the proposed method.
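To make the notion of a fourth-order cumulant concrete, the sketch below estimates cum(x1, x2*, x3, x4*) from samples using the standard moment expansion for zero-mean signals. It shows only the basic statistic; how the abstract's method arranges such cumulants into its special matrix is not specified here and is not reproduced. The function name `cum4` is an illustrative assumption.

```python
import numpy as np

def cum4(x1, x2, x3, x4):
    """Sample estimate of cum(x1, x2*, x3, x4*), assuming zero-mean inputs."""
    x1, x2, x3, x4 = (np.asarray(v, dtype=complex) for v in (x1, x2, x3, x4))
    c2, c4 = x2.conj(), x4.conj()
    return (np.mean(x1 * c2 * x3 * c4)               # fourth-order moment
            - np.mean(x1 * c2) * np.mean(x3 * c4)    # minus the three pairings
            - np.mean(x1 * x3) * np.mean(c2 * c4)    # of second-order moments
            - np.mean(x1 * c4) * np.mean(c2 * x3))
```

For Gaussian signals this quantity tends to zero, which is why cumulant-based methods are insensitive to additive Gaussian noise, while a unit-modulus signal with uniformly random phase gives a value close to -1.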