Interspeech 2018
DOI: 10.21437/interspeech.2018-1150

A Shifted Delta Coefficient Objective for Monaural Speech Separation Using Multi-task Learning

Abstract: This paper addresses the problem of monaural speech separation for simultaneous speakers. Recent studies such as uPIT, cuPIT-Grid LSTM and their variants have advanced the state-of-the-art separation models. Delta and acceleration coefficients are typically used in the objective function to capture short-time dynamics. We consider that such coefficients don't benefit from the temporal information over a long range such as phoneme and syllable. In this paper, we propose a shifted delta coefficient (SDC) objectiv…

Cited by 17 publications (16 citation statements)
References 22 publications
“…In order to better compare the performance of our proposed method (uPIT+DEF+DL) and other separation methods, Table 2 presents the results of SDR (dB) in the other competitive approaches on the same WSJ0-2mix dataset. Note that, for [9,12,25,15,26,13] methods are use SDR improvements results. Therefore, we manually add 0.2 dB to their final results although the SDR result of the mixture is only about 0.15 dB.…”
Section: Comparisons With Other Separation Methods (mentioning)
confidence: 99%
“…To recover a single-talker speech sample, the monaural speech separation techniques could come in handy. Successful implementations include deep clustering [18], deep attractor network [19], permutation invariant training [20]- [22], Conv-TasNet [23], DPRNN [24]. However, speech separation technique seeks to recover the single-talker speech for each individual, that is not only an overkill for speaker verification, but also difficult particularly when we don't know the number of speakers in the multi-talker speech.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recent deep learning based methods, such as Deep Clustering (DC) [3][4][5], Deep Attractor Network (DANet) [6], Permutation Invariant Training (PIT) methods [7][8][9][10], have significantly advanced the performance of multi-taker speech separation. However, the number of speaker has to be known…”
Section: Introduction (mentioning)
confidence: 99%