“…However, these datasets are relatively small (about 15 hours combined) and not diverse. To artificially increase the size of the dataset, [6,16,17] apply data augmentation to the signal, including random channel swapping, amplitude scaling, remixing sources from different songs, time-stretching, pitch shifting, and filtering. These methods, individually or combined, have been shown empirically to improve separation performance only by a limited margin [6].…”
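
Three of the signal-level augmentations listed above (random channel swapping, amplitude scaling, and remixing sources from different songs) can be sketched in a few lines of NumPy. This is a minimal illustration, not the procedure used in [6,16,17]: the function names `augment_source` and `remix`, the gain range, the 0.5 swap probability, and the dictionary layout of the stems are all assumptions made for the example. Time-stretching, pitch shifting, and filtering are omitted because they need a dedicated signal-processing library.

```python
import numpy as np


def augment_source(source, rng, gain_range=(0.25, 1.25)):
    """Augment one source of shape (channels, samples).

    gain_range and the 0.5 swap probability are illustrative assumptions.
    """
    out = source.copy()
    # Random channel swapping: reverse left/right with probability 0.5.
    if out.shape[0] == 2 and rng.random() < 0.5:
        out = out[::-1]
    # Amplitude scaling: one random gain applied to the whole source.
    gain = rng.uniform(*gain_range)
    return out * gain


def remix(sources_by_song, rng):
    """Build a new training mixture from sources of different songs.

    sources_by_song maps song id -> {source name -> array (channels, samples)}.
    All arrays are assumed to be excerpts of the same shape (hypothetical layout).
    """
    song_ids = list(sources_by_song)
    names = list(sources_by_song[song_ids[0]])
    targets = {}
    for name in names:
        # Draw each source from an independently chosen song.
        song = song_ids[rng.integers(len(song_ids))]
        targets[name] = augment_source(sources_by_song[song][name], rng)
    # The mixture is the sum of the augmented sources, so the targets stay consistent.
    mixture = sum(targets.values())
    return mixture, targets
```

Because the mixture is rebuilt by summing the augmented sources, the separation targets remain exact for every generated example, which is what makes this kind of augmentation cheap to apply on the fly during training.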