Audio Source Separation and Speech Enhancement 2018
DOI: 10.1002/9781119279860.ch14

Gaussian Model Based Multichannel Separation

Citations: cited by 3 publications (2 citation statements)
References: 34 publications (57 reference statements)
“…Note that a particular case where $T_j = 1$ and $b_{j,t}(f) = 1$ for all $j$ is equivalent to assuming the norm $r_j(n) = \sqrt{\sum_f |s_j(f, n)|^2}$ follows a complex Gaussian distribution with time-varying variance $h_j(n)$. This is analogous to the assumption in IVA that the magnitudes of the STFT coefficients in all frequency bands originating from the same source tend to vary coherently over time [32].…”
Section: A. ILRMA (supporting)
confidence: 55%
“…If there is a large number of utterances of a sufficiently wide variety of speakers in the training dataset, the trained model is expected to have an ability to express spectrograms of unseen speakers. When a test mixture contains unseen speakers, (31) can be interpreted as how similar speaker j is to the speakers in the training set, whereas (32) indicates the speaker in the training set most similar to speaker j. A test set was created by randomly mixing two different speakers selected from the WSJ0 folders si_dt_05 and si_et_05, where the number of speakers was 18.…”
Section: G. Speaker-independent Separation (mentioning)
confidence: 99%
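
The special case described in the first citation statement above can be illustrated with a small numerical sketch. It assumes, as is not stated in the quoted text, that the source variance follows the usual NMF-style form $v_j(f, n) = \sum_t b_{j,t}(f)\, h_{j,t}(n)$; the function name `nmf_variance` and the array sizes are hypothetical and chosen only for illustration. With a single basis ($T_j = 1$) whose entries are all 1, the variance no longer depends on frequency, which is the IVA-like assumption the statement refers to.

```python
import numpy as np

def nmf_variance(b, h):
    """Assumed NMF-style variance model for one source:
    v[f, n] = sum_t b[f, t] * h[t, n]."""
    return b @ h

F, N = 4, 5                               # hypothetical numbers of frequency bins and frames
rng = np.random.default_rng(0)

h = rng.uniform(0.5, 2.0, size=(1, N))    # T_j = 1: one activation row h_j(n)
b = np.ones((F, 1))                       # b_{j,t}(f) = 1 for every frequency bin

v = nmf_variance(b, h)

# Every frequency bin shares the same time-varying variance h_j(n).
shared_across_frequency = bool(np.allclose(v, np.broadcast_to(h, (F, N))))
```

Here `shared_across_frequency` checks that each row of `v` equals the single activation row, so the model reduces to one time-varying variance per source.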