“…In the past few decades, machine learning, especially deep learning, has achieved remarkable breakthroughs in a wide range of speech tasks, e.g., speech recognition [1,2], speaker verification [3,4,5], language identification [6,7], and emotion classification [8,9]. Each speech task relies on its own specialized techniques to achieve state-of-the-art results [3,6,8,10,11,12], and developing these techniques requires the effort of many experts. It is therefore very difficult to switch between speech tasks without human effort.…”