Soft Mask Methods for Single-Channel Speaker Separation
2007
DOI: 10.1109/tasl.2007.901310

Cited by 102 publications (102 citation statements)
References 16 publications
“…It is worth mentioning here that the estimation algorithm presented in this section is similar in some aspects to other algorithms proposed in the literature for feature compensation [18,19,33], model decomposition [43,47] and single-channel speaker separation [40,41]. Nevertheless, contrary to previous work, the problem we address here is that of speech feature enhancement for noise-robust ASR under the assumption that the corrupting source (noise) is distributed according to a GMM.…”
Section: Spectral Reconstruction Using the Masking Model
confidence: 99%
“…a time-varying Wiener filter, or a more extreme binary time-frequency mask) to recover an estimate of the original target source. Related approaches have been investigated by several other researchers, including [3], who derive soft masks from the posterior probabilities of each cell belonging to a particular source, [4], who learn separate but coupled models for multiple frequency subbands, and [5], who infer distributions over the target speech magnitudes.…”
Section: Introduction
confidence: 99%
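To make the masking vocabulary in this excerpt concrete, here is a minimal NumPy sketch contrasting a Wiener-style soft time-frequency mask with its binary counterpart; the per-source power estimates (`target_psd`, `interferer_psd`) and the function names are illustrative assumptions, not the estimators used in the paper or in [3]–[5].

```python
import numpy as np

def soft_and_binary_masks(target_psd, interferer_psd, eps=1e-12):
    """Wiener-style soft mask and its hard (binary) counterpart.

    target_psd, interferer_psd: per-source power estimates on the same
    time-frequency grid, shape (freq_bins, frames). How these estimates
    are obtained is exactly what the cited methods differ on; here they
    are simply assumed to be given.
    """
    soft = target_psd / (target_psd + interferer_psd + eps)  # values in [0, 1]
    binary = (soft > 0.5).astype(float)                      # keep only dominant cells
    return soft, binary

def apply_mask(mixture_stft, mask):
    """Point-wise re-weighting of the complex mixture STFT; the mixture
    phase is retained, only the magnitudes are scaled."""
    return mask * mixture_stft
```

The soft mask degrades gracefully where both sources have comparable energy, whereas the binary mask commits to the locally dominant source, which is the trade-off the excerpt alludes to.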
“…This is in general an ill-posed problem and cannot be solved without further knowledge about the sources or their interrelationship. Possible SCSS methods are mainly divided into two categories: source-driven [4][5][6][7] and model-based methods [12][13][14][15][16][17][18]. As a major example of the first group, computational auditory scene analysis (CASA) has been widely studied [4].…”
Section: Introduction
confidence: 99%
“…This speaker-dependent model is used as source prior knowledge and is applied for separation without considering the interfering component. The most prominent models are vector quantization (VQ) [12], [16], Gaussian mixture models (GMM) [14], [15] and hidden Markov models (HMM) [17]. In most recently proposed model-based SCSS techniques, it is assumed that the test speech files are recorded under conditions similar to those of the training-phase recordings.…”
Section: Introduction
confidence: 99%
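As a rough illustration of how a speaker-dependent GMM prior can enter a model-based SCSS pipeline, the sketch below fits one GaussianMixture per speaker on log-magnitude spectra and turns posterior-weighted model means into a Wiener-style soft mask. Scoring the mixture directly under each clean-speech model is a deliberate simplification that bakes in exactly the matched-condition assumption the excerpt criticizes; the function names and this shortcut are assumptions for illustration, not the factorial VQ/GMM/HMM estimators of [12]–[17].

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_speaker_prior(log_mag_frames, n_components=8, seed=0):
    """Fit a diagonal-covariance GMM to one speaker's log-magnitude
    spectra; log_mag_frames has shape (n_frames, n_freq_bins)."""
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="diag", random_state=seed)
    gmm.fit(log_mag_frames)
    return gmm

def soft_mask_from_priors(mix_log_mag, gmm_a, gmm_b, eps=1e-12):
    """Illustrative soft mask for speaker A: each GMM's posterior-weighted
    mean spectrum serves as that speaker's magnitude estimate, and the two
    estimates are combined into a Wiener-style ratio per time-frequency cell."""
    resp_a = gmm_a.predict_proba(mix_log_mag)      # (n_frames, K_a) responsibilities
    resp_b = gmm_b.predict_proba(mix_log_mag)
    est_a = np.exp(resp_a @ gmm_a.means_)          # back to linear magnitude
    est_b = np.exp(resp_b @ gmm_b.means_)
    return est_a**2 / (est_a**2 + est_b**2 + eps)  # (n_frames, n_freq_bins)
```

A full model-based system would instead combine the two priors through an explicit interaction model (for example a max or log-max approximation) and search over pairs of components or states jointly; the sketch skips that step for brevity.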