Gaussian Model Based Multichannel Separation

Ozerov, Alexey; Kameoka, Hirokazu

doi:10.1002/9781119279860.ch14

Cited by 3 publications (2 citation statements)

References 34 publications (57 reference statements)

“…Note that a particular case where $T_j = 1$ and $b_{j,t}(f) = 1$ for all $j$ is equivalent to assuming the norm $r_j(n) = \sqrt{\sum_f |s_j(f, n)|^2}$ follows a complex Gaussian distribution with time-varying variance $h_j(n)$. This is analogous to the assumption in IVA that the magnitudes of the STFT coefficients in all frequency bands originating from the same source tend to vary coherently over time [32].…”

Section: A. ILRMA (supporting)

confidence: 55%
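
The quantity in the statement above can be illustrated numerically. The following is a minimal sketch, not code from the cited papers: it assumes that every frequency bin of a source's STFT frame is zero-mean circular complex Gaussian with one variance h(n) shared across frequency, in which case the maximum-likelihood estimate of h(n) is the squared frame norm divided by the number of bins. The function names `frame_norms` and `ml_frame_variance` and the synthetic data are hypothetical.

```python
import numpy as np


def frame_norms(S):
    """Return r(n) = sqrt(sum_f |S[f, n]|^2) for a complex STFT S of shape (F, N)."""
    return np.sqrt(np.sum(np.abs(S) ** 2, axis=0))


def ml_frame_variance(S):
    """ML estimate of h(n), assuming S[f, n] ~ N_C(0, h(n)) for every bin f.

    Under that assumption the estimate is r(n)^2 / F.
    """
    num_bins = S.shape[0]
    return frame_norms(S) ** 2 / num_bins


# Synthetic check (illustrative values only): draw frames with known variances.
rng = np.random.default_rng(0)
F, N = 257, 4
h_true = np.array([0.5, 1.0, 2.0, 4.0])
# Circular complex Gaussian: real and imaginary parts each have variance h/2.
S = np.sqrt(h_true / 2) * (
    rng.standard_normal((F, N)) + 1j * rng.standard_normal((F, N))
)
h_est = ml_frame_variance(S)
```

Because all bins of a frame share a single variance, their magnitudes rise and fall together over time, which is the coherence across frequency bands that the statement attributes to IVA.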

“…If there is a large number of utterances of a sufficiently wide variety of speakers in the training dataset, the trained model is expected to have an ability to express spectrograms of unseen speakers. When a test mixture contains unseen speakers, (31) can be interpreted as how similar speaker j is to the speakers in the training set, whereas (32) indicates the speaker in the training set most similar to speaker j. A test set was created by randomly mixing two different speakers selected from the WSJ0 folders si_dt_05 and si_et_05, where the number of speakers was 18.…”

Section: G. Speaker-independent Separation (mentioning)

confidence: 99%

FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method

Kameoka, Inoue, et al. 2020. IEEE Access. Self cite.

Determined BSS by Combination of IVA and DNN via Proximal Average

Matsumoto, Yatabe. 2024. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Supervised Determined Source Separation with Multichannel Variational Autoencoder

Kameoka, Inoue, et al. 2019. Neural Computation. Self cite.

This paper proposes a multichannel source separation technique called the multichannel variational autoencoder (MVAE) method, which uses a conditional VAE (CVAE) to model and estimate the power spectrograms of the sources in a mixture. By training the CVAE using the spectrograms of training examples with source-class labels, we can use the trained decoder distribution as a universal generative model capable of generating spectrograms conditioned on a specified class label. By treating the latent space variables and the class label as the unknown parameters of this generative model, we can develop a convergence-guaranteed semi-blind source separation algorithm that consists of iteratively estimating the power spectrograms of the underlying sources as well as the separation matrices. In experimental evaluations, our MVAE produced better separation performance than a baseline method.

Index Terms: blind source separation, multichannel non-negative matrix factorization, variational autoencoders (VAEs).
