2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
DOI: 10.1109/waspaa.2019.8937170

Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity

Abstract: Music source separation performance has greatly improved in recent years with the advent of approaches based on deep learning. Such methods typically require large amounts of labelled training data, which in the case of music consist of mixtures and corresponding instrument stems. However, stems are unavailable for most commercial music, and only limited datasets have so far been released to the public. It can thus be difficult to draw conclusions when comparing various source separation methods, as the differ…
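The dataset pairs each mixture with its instrument stems. As a minimal sketch (not from the paper), the snippet below loads one Slakh2100 track's mixture and stems, assuming the released per-track layout of a mix.flac, a stems/ subfolder, and a metadata.yaml whose stem entries carry an inst_class field; the exact keys and paths are assumptions here.

```python
# Hedged sketch: loading one Slakh2100 track, assuming the per-track layout
# mix.flac + stems/<ID>.flac + metadata.yaml (paths/keys are assumptions).
from pathlib import Path

import soundfile as sf  # pip install soundfile
import yaml             # pip install pyyaml


def load_track(track_dir):
    """Return (mixture, stems, sample_rate) for a single Slakh track folder."""
    track_dir = Path(track_dir)
    mixture, sr = sf.read(track_dir / "mix.flac")

    # metadata.yaml maps stem IDs (e.g. "S00") to instrument info.
    with open(track_dir / "metadata.yaml") as f:
        meta = yaml.safe_load(f)

    stems = {}
    for stem_id, info in meta["stems"].items():
        stem_path = track_dir / "stems" / f"{stem_id}.flac"
        if stem_path.exists():  # silent stems may be omitted from the release
            audio, _ = sf.read(stem_path)
            stems[info["inst_class"]] = audio
    return mixture, stems, sr


mix, stems, sr = load_track("slakh2100_flac/train/Track00001")
print(sr, sorted(stems))
```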

Cited by 71 publications (40 citation statements)
References 16 publications

“…The dataset used in this work is the Slakh2100 dataset [5]. The MIDI files are aligned with the audio files and are used as the ground-truth pitch and duration of the notes.…”
Section: Results (mentioning, confidence: 99%)
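The statement above uses the aligned MIDI as note-level ground truth. A minimal sketch of how such pitch and duration labels can be read, assuming the standard pretty_midi library and a hypothetical track path (the cited paper's own tooling is not shown here):

```python
# Hedged sketch: reading note-level ground truth (pitch, onset, duration)
# from a Slakh2100 MIDI file with pretty_midi. The file path is hypothetical.
import pretty_midi

midi = pretty_midi.PrettyMIDI("slakh2100_flac/train/Track00001/all_src.mid")
for inst in midi.instruments:
    for note in inst.notes:
        pitch = note.pitch                # MIDI note number (0-127)
        onset = note.start                # seconds, aligned to the audio
        duration = note.end - note.start  # note length in seconds
        print(inst.name, pitch, onset, duration)
```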
“…We test the f0 detection and note tracking algorithm (Algorithm 1) on the Slakh2100 dataset [31], which provides MIDI files and their synthesized audio files, and we compare the performance of this approach with the OaF model [4]. It is important to note that the tests have been performed on the same dataset, although the OaF model was trained only on the MAESTRO dataset [17] and only for piano transcription.…”
Section: Datasets (mentioning, confidence: 99%)
“…For our second method, we train the CNN model for polyphonic transcription on the Slakh2100 dataset [31]. It contains 145 hours of mixtures across 2100 automatically mixed tracks, together with the corresponding MIDI files from which the audio was synthesized.…”
Section: Datasets (mentioning, confidence: 99%)
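For frame-level transcription targets of the kind this statement describes, a common recipe (an assumption on my part, not the cited model's code) is to rasterize the MIDI into a piano roll:

```python
# Hedged sketch: MIDI to a binary frame-level piano-roll training target.
# The file path is hypothetical.
import pretty_midi

midi = pretty_midi.PrettyMIDI("slakh2100_flac/train/Track00001/all_src.mid")
# (128, T) matrix at 100 frames per second; 1.0 where a note is active.
roll = (midi.get_piano_roll(fs=100) > 0).astype("float32")
print(roll.shape)
```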
“…We used the Slakh2100-split2 (Slakh) [62] and the RWC Popular Music Database (RWC) [63] for evaluation because these datasets include ground-truth beat times. The Slakh dataset contains 2100 musical pieces whose audio signals were synthesized from the Lakh MIDI dataset [64] using professional-grade virtual instruments, and the RWC dataset contains 100 Japanese popular songs.…”
Section: Evaluation Data (mentioning, confidence: 99%)
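Beat-tracking evaluations against such ground-truth beat times are typically scored with mir_eval; the snippet below is a minimal sketch with made-up beat times, not the cited paper's evaluation code:

```python
# Hedged sketch: beat-tracking F-measure with mir_eval (toy data).
import numpy as np
import mir_eval

reference_beats = np.array([5.5, 6.0, 6.5, 7.0])    # ground truth (seconds)
estimated_beats = np.array([5.52, 6.01, 6.49, 7.0])

# F-measure with the default +/-70 ms tolerance window.
f = mir_eval.beat.f_measure(reference_beats, estimated_beats)
print(f"beat F-measure: {f:.3f}")
```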