2018
DOI: 10.1186/s13634-018-0576-2

Blind source separation with optimal transport non-negative matrix factorization

Abstract: Optimal transport as a loss for machine learning optimization problems has recently gained a lot of attention. Building upon recent advances in computational optimal transport, we develop an optimal transport non-negative matrix factorization (NMF) algorithm for supervised speech blind source separation (BSS). Optimal transport allows us to design and leverage a cost between short-time Fourier transform (STFT) spectrogram frequencies, which takes into account how humans perceive sound. We give empirical eviden…
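
The key ingredient named in the abstract, a ground cost between STFT frequency bins that reflects how humans perceive sound, can be sketched numerically. Below is a minimal illustration assuming the POT library (ot.emd2) and a hypothetical log-frequency cost; it shows the mechanics of an optimal-transport loss between spectrogram frames, not the paper's actual algorithm.

```python
# A minimal sketch (not the paper's implementation) of an optimal-transport
# loss between two normalized STFT spectrogram frames, using the POT library
# (pip install pot). The log-frequency ground cost is an illustrative
# stand-in for a perceptually motivated cost, not the paper's exact choice.
import numpy as np
import ot

n_freqs = 257
freqs = np.linspace(1.0, 8000.0, n_freqs)  # assumed frequency grid in Hz

# Ground cost between frequency bins: squared distance on a log scale, so
# that moving mass between perceptually close frequencies is cheap.
log_f = np.log(freqs)
M = (log_f[:, None] - log_f[None, :]) ** 2
M /= M.max()  # normalize the cost for numerical stability

def ot_frame_loss(x, y):
    """Exact optimal-transport cost between two spectrogram frames."""
    a = x / x.sum()  # OT compares distributions, so normalize each frame
    b = y / y.sum()
    return ot.emd2(a, b, M)

# Toy check: two pure tones one bin apart incur only a small transport
# cost, whereas a Euclidean loss would treat them as maximally different.
x = np.zeros(n_freqs); x[100] = 1.0
y = np.zeros(n_freqs); y[101] = 1.0
print(ot_frame_loss(x, y))
```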

Cited by 12 publications (8 citation statements)
References 21 publications
“…This makes it possible to deal with measures computationally, without the need to discretize them on a grid of predefined locations. Further theoretical investigations are needed and the relationship between the proposed construction and existing spectral transport approaches for audio signal processing [58,59] should be investigated. In future work, the author will apply this framework to the approximation and reconstruction of signals and images [60][61][62].…”
Section: Results (mentioning)
confidence: 99%
“…Especially when comparing two non-overlapping distributions (distributions with non-overlapping support), the Wasserstein distance still provides a smooth and meaningful measure, a desirable property that the squared loss and other divergence losses cannot offer (Weng, 2019; Schmitz et al., 2018a). Since the first application of the Wasserstein loss to NMF problems (Sandler and Lindenbaum, 2011), it has been successfully applied to blind source separation (Rolet et al., 2018), dictionary learning (Rolet et al., 2016; Schmitz et al., 2018b), and multi-label supervised learning problems.…”
Section: Introduction (mentioning)
confidence: 99%
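
The smoothness claim above is easy to verify numerically. The following is a small sketch (not from the cited papers) using SciPy's one-dimensional wasserstein_distance: once two histograms have disjoint support, the squared loss saturates at a constant, while the Wasserstein distance keeps growing with the separation.

```python
# Numerical check: for histograms with disjoint support, the squared loss
# is blind to how far apart they are, while the Wasserstein distance is not.
import numpy as np
from scipy.stats import wasserstein_distance

bins = np.arange(100)

def delta_hist(center):
    """Histogram with all of its mass in a single bin."""
    h = np.zeros(len(bins))
    h[center] = 1.0
    return h

ref = delta_hist(10)
for shift in (11, 30, 80):
    other = delta_hist(shift)
    sq = np.sum((ref - other) ** 2)                   # 2.0 for every shift
    w = wasserstein_distance(bins, bins, ref, other)  # grows with the shift
    print(f"shift={shift:2d}  squared_loss={sq:.1f}  wasserstein={w:.1f}")
```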
“…Using probability distances for PSDs is also supported by the work of Basseville [9], who in the late 1980s had already categorised distances for PSDs as probability-based or frequency-based. Along the same lines, the use of the Wasserstein distance on the space of PSDs is not new; related works include that of Flamary et al. [25], which built a dictionary of fundamental and harmonic frequencies, thus emphasising the importance of moving mass along the frequency dimension, and [41], which followed the same rationale for supervised speech blind source separation. However, beyond these specific-purpose works, the literature still lacks a systematic study of the properties of this distance and its suitability for general-purpose time-series applications.…”
Section: Introduction (mentioning)
confidence: 99%