ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
DOI: 10.1109/icassp43922.2022.9746530

On Loss Functions and Evaluation Metrics for Music Source Separation

Cited by 8 publications
(6 citation statements)
References 25 publications
“…where Y_s denotes the ground-truth magnitude spectrogram of a target source. For an investigation of various loss functions used with the UMX network, we refer to [46]. As indicated in Table I, UMX is the model with fewest parameters among different approaches.…”
Section: B Open-unmix (Umx)
Citation type: mentioning
confidence: 99%
“…In this setting, a mixture is fed to a parametric model (i.e., a neural network) that outputs the separated sources. Training is typically performed in a supervised manner by matching the estimated separations with the ground truth sources with a regression loss (e.g., L1 or L2) (Gusó et al. 2022). Supervised regression has been applied to image source separation (Halperin, Ephrat, and Hoshen 2019), but it has been mainly investigated in the audio domain, where two approaches are prevalent: the mask-based approach and the waveform approach.…”
Section: Regression-based Source Separation
Citation type: mentioning
confidence: 99%
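
The regression losses named in the statement above can be illustrated with a minimal sketch. It assumes the estimate and the ground truth are plain equal-length sequences of samples (or flattened spectrogram bins); the function names `l1_loss` and `l2_loss` are hypothetical and are not taken from the cited paper or from any library.

```python
def l1_loss(estimate, target):
    """Mean absolute error between an estimated and a ground-truth source."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    return sum(abs(e - t) for e, t in zip(estimate, target)) / len(target)


def l2_loss(estimate, target):
    """Mean squared error between an estimated and a ground-truth source."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


# Toy values, chosen only for illustration (not data from any paper).
estimate = [1.0, 2.0, 3.0]
target = [1.0, 0.0, 5.0]
l1_value = l1_loss(estimate, target)
l2_value = l2_loss(estimate, target)
```

The L2 loss squares each error and so penalizes large deviations more heavily than the L1 loss does; which of the two suits a given separation model is the kind of question the cited paper investigates.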
“…For example, a recent paper found that while their low-rate speech coder performed worse according to POLQA [270], a subjective evaluation showed that their coder had similar performance to competing high-rate coders [271]. More generally, objective measures are known to not correlate well with subjective quality [28,29]. Hence we used a subjective test for the evaluation of our speech enhancement system.…”
Section: Subjective Evaluation
Citation type: mentioning
confidence: 99%
“…The SI-SNR is also used to evaluate and compare these models. Importantly, [28,29] recently showed that the SI-SNR is not well correlated with perceptual quality for speech separation systems. Hence, the field is potentially comparing models using a measure not indicative of perceptual quality.…”
Section: Investigation Of the Relationship Between Si-snr And Percept...
Citation type: mentioning
confidence: 99%
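
For readers unfamiliar with the SI-SNR mentioned in the statement above, the following is a minimal sketch of the commonly used definition: both signals are made zero-mean, the estimate is projected onto the reference, and the energy ratio of the projection to the residual is reported in decibels. The function name `si_snr` and the `eps` stabilizer are assumptions made for illustration; the citing work does not give an implementation.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB (common definition)."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)
    # Remove the mean of each signal.
    mean_e = sum(estimate) / n
    mean_r = sum(reference) / n
    e = [x - mean_e for x in estimate]
    r = [x - mean_r for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(a * b for a, b in zip(e, r))
    ref_energy = sum(b * b for b in r) + eps
    scale = dot / ref_energy
    target = [scale * b for b in r]
    # Everything not explained by the reference counts as noise.
    noise = [a - t for a, t in zip(e, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise) + eps
    return 10.0 * math.log10(target_energy / noise_energy + eps)


# Toy signals, chosen only for illustration.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -1.1, 1.2, -0.8]
score = si_snr(estimate, reference)
# Rescaling the estimate leaves the score essentially unchanged.
score_scaled = si_snr([2.0 * x for x in estimate], reference)
```

Because the measure depends only on this energy ratio, it says nothing by itself about how a listener would rate the separated audio, which is the gap the statement above points to.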