2012
DOI: 10.1186/1687-4722-2012-14

Using information theoretic distance measures for solving the permutation problem of blind source separation of speech signals

Abstract: The problem of blind source separation (BSS) of convolved acoustic signals is of great interest for many classes of applications. Due to the convolutive mixing process, the source separation is performed in the frequency domain, using independent component analysis (ICA). However, frequency domain BSS involves several major problems that must be solved. One of these is the permutation problem. The permutation ambiguity of ICA needs to be resolved so that each separated signal contains the frequency components …

Cited by 11 publications
(14 citation statements)
References 36 publications
“…For example, STFT coefficients can be modelled in accordance with the Chi [20] and Rayleigh [17] distribution functions. Moreover, the logarithm of the STFT speech magnitude is modelled by a generalized Gaussian distribution in [17]. However, a single pdf class cannot model the distribution of STFT coefficients with high accuracy across all frequencies.…”
Section: Measuring the Similarity Of The Speech Spectrum
mentioning
confidence: 99%
“…The key assumption is that the probability density function (pdf) of each frequency bin is similar for all frequency components of a given speech source. Thus, the permutation can be corrected by considering small deviations in the parameters of the pdf model between neighboring bins [16] or by using several distance measures based on Information Theory [17].…”
mentioning
confidence: 99%
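The idea in the statement above, correcting the permutation by comparing the distributions of neighboring frequency bins, can be sketched as follows. This is a minimal illustration and not the method of the cited paper: it assumes the separated STFT coefficients are held in a hypothetical array `Y` of shape (frequency bins, sources, frames), it estimates each pdf with a histogram of log-magnitudes, and it uses the Jensen-Shannon divergence as one possible information theoretic distance. The names `log_mag_hist`, `js_divergence` and `align_permutations` are introduced here for illustration only.

```python
import itertools
import numpy as np

def log_mag_hist(x, bins, eps=1e-12):
    """Normalised histogram of the log-magnitudes of one bin's coefficients."""
    h, _ = np.histogram(np.log(np.abs(x) + eps), bins=bins)
    h = h.astype(float) + eps          # avoid empty cells before taking logs
    return h / h.sum()

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def align_permutations(Y, bins):
    """Reorder the sources in each bin to match the previous (aligned) bin.

    Y is assumed to have shape (n_freq, n_src, n_frames).
    Returns the aligned copy and the permutation chosen for each bin.
    """
    Y = Y.copy()
    n_freq, n_src, _ = Y.shape
    perms = [tuple(range(n_src))]      # bin 0 is taken as the reference order
    ref = [log_mag_hist(Y[0, s], bins) for s in range(n_src)]
    for f in range(1, n_freq):
        cur = [log_mag_hist(Y[f, s], bins) for s in range(n_src)]
        # Exhaustive search over permutations; only practical for few sources.
        best = min(
            itertools.permutations(range(n_src)),
            key=lambda p: sum(js_divergence(ref[i], cur[p[i]])
                              for i in range(n_src)),
        )
        Y[f] = Y[f, list(best)]
        perms.append(best)
        ref = [cur[i] for i in best]   # aligned bin becomes the new reference
    return Y, perms
```

Chaining each bin to its already-aligned neighbor keeps the comparison local, but it also means that a single wrong decision would propagate to all higher bins; a practical system would likely add a more robust reference.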
“…From the constraint q + q 2 = 1, we obtain 2δ i q i = − q 2 , which implies that the higher-order terms in (10) are o( q 2 ) and can be neglected. Through a substitution, (10) can be simplified as…”
Section: Problem Formulation
mentioning
confidence: 99%
“…Conventional BSS approaches do not ensure separation of source signals with a specified order in accordance with their stochastic properties. There are actually three types of source signals in [10,16]; they are known as standard normal, sup-Gaussian, and sub-Gaussian. The kurtosis in [27,32] is used to measure signals of the different distributions.…”
Section: Introduction
mentioning
confidence: 99%
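The use of kurtosis to tell these signal classes apart, as mentioned in the statement above, can be illustrated with a short sketch. It is not taken from the citing paper: the function names and the tolerance `tol` are assumptions made here. It relies on the standard fact that the excess kurtosis is zero for a Gaussian, positive for super-Gaussian (heavy-tailed) signals and negative for sub-Gaussian ones.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis: fourth central moment / variance^2, minus 3."""
    x = np.asarray(x, dtype=float)
    c = x - x.mean()
    return np.mean(c ** 4) / np.mean(c ** 2) ** 2 - 3.0

def classify_by_kurtosis(x, tol=0.2):
    """Label a signal by the sign of its excess kurtosis.

    tol is an illustrative dead zone around zero, not a value from the source.
    """
    k = excess_kurtosis(x)
    if k > tol:
        return "super-Gaussian"
    if k < -tol:
        return "sub-Gaussian"
    return "Gaussian"

rng = np.random.default_rng(0)
n = 200_000
labels = {
    "normal": classify_by_kurtosis(rng.standard_normal(n)),
    "laplace": classify_by_kurtosis(rng.laplace(size=n)),
    "uniform": classify_by_kurtosis(rng.uniform(-1.0, 1.0, size=n)),
}
```

Speech is commonly treated as super-Gaussian, which is why a Laplace draw is used as the heavy-tailed example here.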
“…The amplitude ambiguity is usually solved by the Minimal Distortion Principle [4]. For the permutation ambiguity, there have been several approaches with different degrees of success, using for example the correlation among envelopes [5] or power ratios [2], the generalized coherence function [6], the use of information theoretic distance measures [7], or the pseudoanechoic model for blind source separation [8]. In the last case, the problem was simplified in such a way that it required the robust estimation of the separation matrix for only one frequency bin, which then will serve to estimate the parameters for defining the separation matrices for all bins.…”
Section: Introduction
mentioning
confidence: 99%