Geometric source separation: merging convolutive source separation with geometric beamforming

Parra, Lucas C.; Alvino, Christopher

doi:10.1109/nnsp.2001.943132

Cited by 88 publications

(148 citation statements)

References 12 publications

Supporting

Mentioning

148

Contrasting


“…In effect, the good SIR of the strongest source in the original mixtures cannot be improved upon by the separation algorithm, whereas the improvements obtained in the weakest source in the original mixtures can be significant, even for very low initial SIRs. Moreover, both the frequency-domain decorrelation methods in [2,3] and the frequency domain version of the scaled natural gradient algorithm provide very little separation when there is a high power imbalance in the original signal mixtures. For these frequency-domain algo- [Figure: Output SIR for the weakest user, 3-source case; legend: STFICA [7], STFICA-Symm [7], PARRA [2], GEOBSS [3], TRUNC-NG [5], SNGTD [8], SNGFD [8]] Fig.…”

Section: Results (mentioning)

confidence: 99%

“…The algorithms chosen can loosely be classified into groups according to the separation criterion used: (1) decorrelation-based methods, (2) information-theoretic methods, and (3) contrast-based methods. The second-order statistics-based frequency domain joint decorrelation algorithm of Parra and Spence [2] and its beamforming constrained version [3] attempt to jointly diagonalize correlation matrices as measured from the mixtures. The natural gradient algorithms presented in [5] and [8] attempt to minimize the mutual information of the extracted signals using frequency-domain and time-domain system structures, respectively.…”

Section: Technical Rationale (mentioning)

confidence: 99%

“…Parra's decorrelation-based method with beamforming constraints (Eq. 16, [3]) with two sources employed the parameters K = 5, N = 43, L = 512, and a 1000-iteration gradient update limit. For the three-source mixtures, all of the [Figure: Comparison of CBSS Algorithms in Power Imbalance, 2-source case; legend: STFICA [7], STFICA-Symm [7], PARRA [2], GEOBSS [3], TRUNC-NG [5], SNGTD [8], SNGFD [8]; groups: Weaker talker, Stronger talker] Fig.…”

Section: Technical Rationale (mentioning)

confidence: 99%

“…above parameters were kept the same except the algorithms in [2] and [3], for which the value of N was changed to 32 and 65, respectively. After separation, least-squares methods were used to estimate a 2048-tap long channel impulse response given the original source recording and the separated source for each system output.…”

Section: Technical Rationale (mentioning)

confidence: 99%
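
The least-squares channel estimate mentioned in the statement above could be sketched roughly as follows. This is a minimal illustration, not the citing authors' code: the function name `estimate_fir_least_squares`, the toy signals, and the 8-tap length are assumptions made for the example (the citing paper reports a 2048-tap response).

```python
import numpy as np


def estimate_fir_least_squares(source, output, num_taps):
    """Fit a causal FIR filter h so that output ~= source convolved with h.

    Hypothetical helper for illustration only; assumes both signals
    have the same length and that num_taps is at most that length.
    """
    source = np.asarray(source, dtype=float)
    output = np.asarray(output, dtype=float)
    n = len(output)
    # Column k of X holds the source delayed by k samples (zero-padded).
    X = np.zeros((n, num_taps))
    for k in range(num_taps):
        X[k:, k] = source[: n - k]
    # Solve min_h ||X h - output||^2.
    h, *_ = np.linalg.lstsq(X, output, rcond=None)
    return h


# Toy check: recover a known short impulse response from noise-free data.
rng = np.random.default_rng(0)
source = rng.standard_normal(4000)
true_h = np.array([1.0, 0.5, -0.25, 0.125])
output = np.convolve(source, true_h)[: len(source)]
h_hat = estimate_fir_least_squares(source, output, num_taps=8)
```

With one such estimate per source and per system output, the energy each source contributes to each output can then be compared, which is one common way to arrive at an SIR figure. For long filters such as 2048 taps, a dense delay matrix becomes large, so a structured or frequency-domain solver may be preferable.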


Performance Evaluation of Convolutive Blind Source Separation of Mixtures of Unequal-Level Speech Signals

Gupta

Douglas

2007

2007 IEEE International Symposium on Circuits and Systems (ISCAS)


The performance of any signal enhancement method depends on the relative amplitudes of the signals and interferences present in the original measurements. Previous evaluations of convolutive blind source separation methods for speech enhancement have considered situations where the signal-to-interference ratios (SIRs) of all the talkers' speech signals are nearly equal in the recorded signal mixtures. This paper presents real-world separation experiments of two- and three-talker signal mixtures in which the level of one talker's speech is lower than those of the others. Methods based on decorrelation, frequency-domain information maximization, and time-domain contrast optimization are studied. Experimental evaluation shows that the weaker talker's speech receives the most enhancement in the separated system's outputs, whereas the stronger talker's speech signals receive moderate to little enhancement beyond a limiting value.


“…Due to the spatial directivity, it can also mitigate the effect of reverberation which causes a field of dispersed signals. The limitation of beamforming is that separation is not possible when multiple sounds come from directions that are the same or near to each other (Wolfel and McDonough, 2009; Parra and Alvino, 2002).…”

Section: Prior Work (mentioning)

confidence: 99%

Computational methods for underdetermined convolutive speech localization and separation via model-based sparse component analysis

Asaei

Bourlard

Taghizadeh

et al. 2016

Speech Communication


In this paper, the problem of speech source localization and separation from recordings of convolutive underdetermined mixtures is studied. The problem is cast as recovering the spatio-spectral speech information embedded in a microphone array compressed measurements of the acoustic field. A model-based sparse component analysis framework is formulated for sparse reconstruction of the speech spectra in a reverberant acoustic resulting in joint localization and separation of the individual sources. We compare and contrast the computational approaches to model-based sparse recovery exploiting spatial sparsity as well as spectral structures underlying spectrographic representation of speech signals. In this context, we explore identification of the sparsity structures at the auditory and acoustic representation spaces. The auditory structures are formulated upon the principles of structural grouping based on proximity, autoregressive correlation and harmonicity of the spectral coefficients and they are incorporated for sparse reconstruction. The acoustic structures are formulated upon the image model of multipath propagation and they are exploited to characterize the compressive measurement matrix associated with microphone array recordings. Three approaches to sparse recovery relying on combinatorial optimization, convex relaxation and Bayesian methods are studied and evaluated based on thorough experiments. The sparse Bayesian learning method is shown to yield better perceptual quality while the interference suppression is also achieved using the combinatorial approach with the advantage of offering the most efficient computational cost. Furthermore, it is demonstrated that an average autoregressive model can be learned for speech localization and exploiting the proximity structure in the form of block sparse coefficients enables accurate localization.
Throughout the extensive empirical evaluation, we confirm that a large and random placement of the microphones enables significant improvement in source localization and separation performance.


Enhancing listening capability of humanoid robot by reduction of stationary ego‐noise

Wake

Fukumoto

Takahashi

et al. 2019

IEEJ Transactions Elec Engng


Speech interfaces for household robots utilizing third‐party automatic speech recognition (ASR) services face the challenge of overcoming stationary ego‐noise that decreases ASR accuracy. Previous studies on signal processing have proposed numerous noise reduction methods that increase the signal‐to‐noise ratio of speech audio and subjective speech clarity. However, severe limitations on the cost of hardware of household robots and the use of closed ‘black box’ ASR services require us to re‐examine the efficacy of noise reduction methods in this context. Here we compare the effect of several basic noise filters on the performance of ASR services when speech sounds include the stationary ego‐noise of a humanoid Pepper robot. The result revealed that a spectrum subtraction filter improves the accuracy of ASR services best. We also demonstrate that the filter improves ASR performance on an actual Pepper robot system. This study not only provides practical knowledge on the selection of noise filters for a robot system but also discusses further improvements to the listening capabilities of the robot utilizing ASR. © 2019 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.

