2017
DOI: 10.48550/arxiv.1711.04121
Preprint

Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning

Citing publications by year: 2019, 2021

Cited by 2 publications (2 citation statements)
References 16 publications
“…We use a conditional GAN 38 as the generative model in our generative decoder (for ease of presentation, we slightly abuse the notation $G(\cdot)$ to also denote the generative model), which is adopted from the implementation of 44 and pre-trained using digits in the cGAN dataset. Note that this is the only supervision we have in this task, which is even weaker than the general concept of the weakly-supervised setting 45. Given the shape embeddings ($z_{i,j}$) and the probability embeddings ($P_i$ and $Q_i$), DRNets estimate the two digits in the cell by computing the expected digits over $P_i$ and $Q_i$, i.e., $\sum_{j=1}^{n} P_{i,j}\, G(z_{i,j})$ and $\sum_{j=1}^{n} Q_{i,j}\, G(z_{i,j+4})$, and remix them to reconstruct the original input mixture (Fig.…”
Section: Competing Interests
confidence: 99%
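The excerpt above describes computing expected digits as probability-weighted sums over generator outputs and remixing them. A minimal NumPy sketch of that expectation-and-remix step, using made-up array shapes and random stand-ins for the cGAN generator outputs (none of these names come from the paper):

```python
import numpy as np

# Hypothetical sizes: n candidate latent codes per digit, d pixels per image.
n, d = 4, 16
rng = np.random.default_rng(0)

# Stand-in for G(z_{i,j}): one generated image per latent code.
# The first n rows play the role of z_{i,j}, the next n of z_{i,j+4}.
G_z = rng.random((2 * n, d))

# Probability embeddings P_i and Q_i (normalized to sum to 1).
P = rng.random(n); P /= P.sum()
Q = rng.random(n); Q /= Q.sum()

# Expected digits: sum_j P_{i,j} G(z_{i,j}) and sum_j Q_{i,j} G(z_{i,j+4}).
digit_1 = (P[:, None] * G_z[:n]).sum(axis=0)
digit_2 = (Q[:, None] * G_z[n:2 * n]).sum(axis=0)

# Remix the two expected digits to reconstruct the input mixture
# (additive mixing assumed here for illustration).
reconstruction = digit_1 + digit_2
```

In the actual DRNets pipeline the reconstruction would then be compared against the observed mixture to drive training; here the additive remix is only an assumption for illustration.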
“…Semi-supervised separation methods based on generative adversarial learning were proposed in [38], [39]. The key assumption of these methods is that estimated sources produced by an optimal separator should be indistinguishable from real sound sources, i.e., they should be samples drawn from the same distribution.…”
Section: Introduction
confidence: 99%
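The indistinguishability assumption in the excerpt above can be made concrete with a toy Wasserstein-style critic: the critic scores samples, and the separator is penalized when its estimated sources score differently from real ones. This is a minimal sketch with a linear critic and random data, not the method of [38] or [39]:

```python
import numpy as np

rng = np.random.default_rng(0)

def critic(x, w):
    """Linear critic: higher score = 'looks more like a real source' (toy stand-in)."""
    return x @ w

w = rng.standard_normal(8)
real_sources = rng.standard_normal((32, 8))       # samples of real sound sources
estimated_sources = rng.standard_normal((32, 8))  # separator outputs (random here)

# Wasserstein-style critic objective: mean real score minus mean estimated score.
critic_gap = critic(real_sources, w).mean() - critic(estimated_sources, w).mean()

# The separator's adversarial loss pushes its outputs toward the real
# distribution by maximizing their critic score.
separator_loss = -critic(estimated_sources, w).mean()
```

When the separator is optimal under this objective, the critic can no longer separate the two score distributions, i.e., the estimated sources are samples from (approximately) the same distribution as the real ones.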