“…With the rapid recent progress in neural networks (NNs) and deep learning, several sophisticated architectures have been proposed and successfully used for single-channel source separation [1,2,3,4]. More recently, we have started to operate directly on the waveforms, with several end-to-end approaches available [2,5,6], and to use better cost functions motivated by the source-to-distortion ratio (SDR) [7,8,9,10,11,2]. Using deep clustering [1] and permutation-invariant training [12], we can train the networks to perform speaker-independent source separation.…”
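
As a rough illustration of how an SDR-motivated cost function and permutation-invariant training fit together, the sketch below scores every assignment of network outputs to reference sources and keeps the best one. It is a minimal NumPy sketch under stated assumptions, not the loss used in any of the cited works: it uses a plain SDR (reference energy over error energy), whereas the cited works may use scale-invariant or other variants, and the names `sdr_db` and `pit_neg_sdr` are hypothetical.

```python
import itertools

import numpy as np


def sdr_db(reference, estimate, eps=1e-8):
    """Plain SDR in dB: reference energy divided by error energy (assumed form)."""
    error = reference - estimate
    return 10.0 * np.log10((np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps))


def pit_neg_sdr(references, estimates):
    """Negative mean SDR under the best output-to-source assignment.

    references, estimates: arrays of shape (num_sources, num_samples).
    Returns (loss, perm), where perm[i] is the estimate index matched to source i.
    """
    num_sources = references.shape[0]
    best_loss, best_perm = np.inf, None
    # Exhaustive search over assignments; fine for the small source counts typical here.
    for perm in itertools.permutations(range(num_sources)):
        mean_sdr = np.mean([sdr_db(references[i], estimates[p]) for i, p in enumerate(perm)])
        if -mean_sdr < best_loss:
            best_loss, best_perm = -mean_sdr, perm
    return best_loss, best_perm


# Toy check: the estimates are the references in swapped order plus light noise.
rng = np.random.default_rng(0)
refs = rng.standard_normal((2, 16000))
ests = refs[::-1] + 0.1 * rng.standard_normal((2, 16000))
loss, perm = pit_neg_sdr(refs, ests)
```

Because the loss is taken over the best assignment, the network is not penalized for emitting the speakers in a different order than the references, which is what allows training without a fixed speaker-to-output mapping.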