Interspeech 2018
DOI: 10.21437/interspeech.2018-1463

Assessing Speaker Engagement in 2-Person Debates: Overlap Detection in United States Presidential Debates

Abstract: Co-channel speech recordings typically contain significant amounts of overlap in which the intelligibility and quality of the desired speech is degraded by interference from a competing talker. Convolutive Non-negative Matrix Factorization (CNMF) has been shown to be a successful approach in detecting overlap by extracting specific acoustic basis dimensions for each speaker from an audio stream. While the results of CNMF have been successful, it requires isolated single speech recordings for each speaker to de…

Citations: cited by 13 publications (8 citation statements)
References: 9 publications
“…The GRID is a multi-speaker, sentence corpus [32], which has been used in monaural speech separation and recognition challenge [33]. Additionally, this corpus has been widely used for assessing the perception of simultaneous speech signals [34,35,36]. This corpus consists of 34 subjects (18 male and 16 female speakers), each narrating 1000 sentences.…”
Section: Experiments Results and Discussion (mentioning)
confidence: 99%
“…We generate overlapping speech utterances based on the GRID corpus, which is a multi-speaker, sentence-based corpus used in a monaural speech separation and recognition challenge [42]. This corpus contains 34 speakers, 16 female and 18 male speakers, each providing 1000 sentences, which have been frequently used in several overlapping speech detection and separation studies [43], [44], [45], [37]. To generate mixed speech utterances, random speech recordings are selected from random speakers.…”
Section: Experiments a Dataset (mentioning)
confidence: 99%
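
The mixing step described in the statement above (random recordings selected from random speakers) can be illustrated with a minimal sketch. This is not the citing authors' procedure: the function name `mix_random_pair`, the toy corpus, the choice of two distinct speakers, zero-padding of the shorter signal, and plain sample-wise summation are all assumptions made for illustration.

```python
import random


def mix_random_pair(corpus, rng):
    """Pick two speakers at random, one utterance each, and sum them.

    `corpus` maps a speaker id to a list of waveforms (lists of float samples).
    Assumption: the two speakers are distinct and the signals are simply added.
    Returns (speaker_a, speaker_b, mixture).
    """
    spk_a, spk_b = rng.sample(sorted(corpus), 2)  # two distinct speakers
    wav_a = rng.choice(corpus[spk_a])
    wav_b = rng.choice(corpus[spk_b])
    n = max(len(wav_a), len(wav_b))
    # Zero-pad the shorter signal so both cover the full mixture length.
    pad_a = wav_a + [0.0] * (n - len(wav_a))
    pad_b = wav_b + [0.0] * (n - len(wav_b))
    mixture = [a + b for a, b in zip(pad_a, pad_b)]
    return spk_a, spk_b, mixture


# Hypothetical toy stand-in for a corpus: three speakers, two short signals each.
toy_corpus = {
    "s1": [[0.1, 0.2, 0.3], [0.5, 0.5]],
    "s2": [[-0.1, -0.2], [0.4, 0.0, 0.4, 0.0]],
    "s3": [[0.3], [0.2, 0.2, 0.2]],
}
rng = random.Random(0)  # fixed seed so the selection is repeatable
spk_a, spk_b, mixture = mix_random_pair(toy_corpus, rng)
```

With real recordings the lists would be replaced by decoded audio samples, and a level adjustment between the two talkers would likely be needed; neither is specified in the quoted statement.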
“…In this work we use a multi-speaker, sentence-based corpus called GRID, which has been used in monaural speech separation and recognition challenge [18]. Also, this dataset has been used in several studies [12,19] for overlapping speech detection and separation. This corpus contains 34 speakers, 16 female and 18 male speakers, each narrating 1000 sentences.…”
Section: Problem Formulation (mentioning)
confidence: 99%