2011
DOI: 10.1016/j.specom.2011.01.002

A multistage approach to blind separation of convolutive speech mixtures

Abstract: In this paper, we propose a novel algorithm for the separation of convolutive speech mixtures using two-microphone recordings, based on the combination of independent component analysis (ICA) and ideal binary mask (IBM), together with a post-filtering process in the cepstral domain. Essentially, the proposed algorithm consists of three steps. First, a constrained convolutive ICA algorithm is applied to separate the source signals from two-microphone recordings. In the second step, we estimate the IBM by compar…
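
The abstract is cut off before it says what the second step compares, so the following is only a minimal sketch of binary time-frequency (T-F) masking in general, not the authors' method. It assumes the mask is obtained by comparing the magnitudes of two already-separated signals at each T-F unit; the function names `binary_masks` and `apply_mask` and the list-of-lists layout (rows are frequency bins, columns are time frames) are hypothetical choices made for illustration.

```python
def binary_masks(mag_a, mag_b):
    """Build complementary binary masks from two magnitude spectrograms.

    Assumption (not confirmed by the truncated abstract): a T-F unit is
    assigned to source A when A's magnitude exceeds B's, otherwise to B.
    """
    mask_a = [
        [1 if a > b else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(mag_a, mag_b)
    ]
    # The second mask is the complement, so every unit goes to exactly one source.
    mask_b = [[1 - m for m in row] for row in mask_a]
    return mask_a, mask_b


def apply_mask(mask, spectrogram):
    """Keep the T-F units where the mask is 1 and zero out the rest."""
    return [
        [m * x for m, x in zip(mask_row, spec_row)]
        for mask_row, spec_row in zip(mask, spectrogram)
    ]


# Toy 2x2 magnitude spectrograms (hypothetical values).
mag_a = [[0.9, 0.1], [0.2, 0.8]]
mag_b = [[0.3, 0.7], [0.6, 0.1]]
mask_a, mask_b = binary_masks(mag_a, mag_b)
masked_a = apply_mask(mask_a, mag_a)
```

In practice the magnitudes would come from a short-time Fourier transform of each signal; plain lists are used here only to keep the sketch free of dependencies.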

Cited by 24 publications (18 citation statements)
References 36 publications
“…IdBM techniques often introduce musical noise, caused by errors in the estimation of the time-frequency masks and manifested in isolated T-F units. A number of techniques have been proposed to suppress musical noise distortions introduced by IdBM techniques [32,33]. While musical noise might be distracting to the listeners, it has not been found to be detrimental in terms of speech intelligibility.…”
Section: Relationship Between Proposed Residual Constraints and T | Citation type: mentioning
confidence: 99%
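
The statement above attributes musical noise to mask errors that show up as isolated T-F units. As a rough illustration of what "isolated" can mean, the sketch below clears any active unit that has no active neighbour in time or frequency. This is a simple stand-in chosen for illustration, not one of the techniques cited as [32,33] and not the cepstral post-filter of the paper; the function name `remove_isolated_units` and the 4-neighbour rule are assumptions.

```python
def remove_isolated_units(mask):
    """Clear active units of a binary T-F mask that have no active 4-neighbour.

    `mask` is a list of rows (frequency bins), each a list of 0/1 values
    (time frames). The input is not modified; a cleaned copy is returned.
    """
    n_freq = len(mask)
    n_time = len(mask[0]) if n_freq else 0
    cleaned = [row[:] for row in mask]
    for f in range(n_freq):
        for t in range(n_time):
            if mask[f][t] != 1:
                continue
            neighbours = [(f - 1, t), (f + 1, t), (f, t - 1), (f, t + 1)]
            has_active_neighbour = any(
                0 <= i < n_freq and 0 <= j < n_time and mask[i][j] == 1
                for i, j in neighbours
            )
            if not has_active_neighbour:
                cleaned[f][t] = 0  # treat the lone unit as an estimation error
    return cleaned


# Toy mask: the unit at (0, 0) is isolated, the two in the last column are not.
noisy_mask = [[1, 0, 0], [0, 0, 1], [0, 0, 1]]
cleaned_mask = remove_isolated_units(noisy_mask)
```

Hard removal of this kind can also delete genuine low-energy speech units, which is one reason smoother post-processing is often preferred.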
“…The AV atoms are inseparable, i.e., each audio atom and its associated visual atom always appear in pairs at a TS position of the AV sequence. As a result, if a visual atom appears at the TS position , i.e., , there exists a corresponding non-zero coefficient in the set , subject to (2) In the above set, rounds a number to its nearest integer; denotes the same temporal position as with a finer resolution. The coarse TS position and comprise a fine TS position .…”
Section: A Generative Model | Citation type: mentioning
confidence: 99%
“…To optimise it, first we need to calculate the overall matching criterion using (5), with being tied with via set (2). In the -th iteration of the coding stage, the optimal atom index and the associated translation can therefore be found by maximizing the following objective function: (7) where is associated with as defined in set (2). Then we can set values in the parameter set : (8) Finally, the residual 2 will be updated via: (9) There are iterations in total.…”
Section: New Matching Criterion | Citation type: mentioning
confidence: 99%
“…Pitch information is relatively reliable under noisy conditions and can be used to improve the system performance [81]. Another potential direction is to use the property of the sources and noise/interferences, such as sparseness, to facilitate the identification of the reliable regions within the mixture that can be used to estimate the sources [74][75][76][77]. This is mainly due to the following three reasons.…”
Section: Future Directions | Citation type: mentioning
confidence: 99%