“…SLSE has found many applications in modern-day life, e.g., voice assistants [1], smart homes [2], video conferencing [3], acoustic scene analysis [4], and event detection [5]. SLSE methods fall into two categories: one-stage methods [6], [7] and two-stage methods [8], [9], [10]. One-stage methods perform source localization and signal extraction in a single step, whereas two-stage methods first localize the sources and then, in a subsequent step, separate the signals emitted by those sources using the information gained in the first stage.…”