“…Although many machine learning approaches have been proposed for audio processing [3,8,13,14,18,19,23,24,28,32,33], to the best of our knowledge, there is no existing method that can directly address AMSS (see §3). This paper proposes a novel endto-end neural network that performs AMSS according to the given textual query.…”