ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9053426

Continuous Speech Separation: Dataset and Analysis

Abstract: This paper describes a dataset and protocols for evaluating continuous speech separation algorithms. Most prior studies on speech separation use pre-segmented signals of artificially mixed speech utterances which are mostly fully overlapped, and the algorithms are evaluated based on signal-to-distortion ratio or similar performance metrics. However, in natural conversations, a speech signal is continuous, containing both overlapped and overlap-free components. In addition, the signal-based metrics have very we…

Citations: cited by 177 publications (166 citation statements)
References: 45 publications
“…In the second experiment, we used the meeting-like LibriCSS corpus [18], which consists of 8-speaker meeting-like recordings sessions of 10 minutes, obtained by re-recording LibriSpeech utterances played through loudspeakers in a meeting room. The overlap ratio varies from 0 to 40 %.…”
Section: Dataset
Citation type: mentioning
confidence: 99%
“…We discuss related works in Section 4. In Section 5, we present experimental results based on the LibriCSS corpus [18]. Finally, we conclude the paper in Section 6.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…It contains 10 hours of audio recordings in regular meeting rooms. Each mini-session (footnote 1: Readers can refer to [23] to get more details) in…”
Section: Dataset
Citation type: mentioning
confidence: 99%
“…All our models in the table use the window size of 2.4 s. 0S/L [23]: 0% overlap ratio with short/long silence.…”
Citation type: mentioning
confidence: 99%
“…However, the microphone array in this dataset is only a circular array and cannot be changed, so it cannot be applied to the scenes that require a specific shape of the microphone array. Chen et al. proposed a dataset [14] for evaluating continuous speech separation. In this dataset, the speech signal is continuous, containing both the overlapped and overlap-free components.…”
Section: Introduction
Citation type: mentioning
confidence: 99%