Interspeech 2020
DOI: 10.21437/interspeech.2020-2285
End-to-End Domain-Adversarial Voice Activity Detection

Cited by 30 publications (16 citation statements); references 0 publications.
“…Generally, noise-robust VAD systems are developed using audio from clean speech datasets augmented with different types of noise [21, 11, 8]. Hebbar et al. [11] compiled the Subtitle-Aligned Movie (SAM) corpus, a dataset based on 117 hours of movie audio.…”
Section: Training Data
confidence: 99%
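The augmentation scheme this excerpt refers to — mixing noise recordings into clean speech at controlled levels — is conventionally parameterized by a target signal-to-noise ratio. A minimal sketch (the function name and array layout are illustrative assumptions, not from the cited work):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Mix a noise signal into clean speech at a target SNR in dB.

    The noise is tiled or trimmed to the speech length, then scaled so
    that 10*log10(P_speech / P_noise) equals snr_db.
    """
    noise = np.resize(noise, speech.shape)  # tile/trim to match length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise
```

Sampling `snr_db` per utterance (e.g. uniformly over a range such as 0–20 dB) is the usual way to expose a VAD model to a spread of noise conditions during training.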
“…Then a smoothing filter is applied to decide the label for a frame spanned by multiple segments. We examined two common smoothing filters: majority vote (median) [11] and average (mean) [8]. We also experimented with different amounts of overlap, ranging from 12.5% to 87.5%, to understand the effect of this parameter on frame-level performance.…”
Section: Model
confidence: 99%
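The smoothing step described above can be sketched as follows: each frame collects the scores of every overlapping segment that covers it, and a mean or median reduces them to one decision per frame. The function name, score layout, and hop parameterization are assumptions for illustration (with segment length L and hop h, the overlap fraction is (L - h) / L, so 87.5% overlap with L = 8 corresponds to h = 1):

```python
import numpy as np

def smooth_frame_scores(segment_scores, hop, num_frames, mode="mean"):
    """Combine per-frame scores from overlapping segments into one
    frame-level score track.

    segment_scores: list of 1-D arrays, one per segment; segment i is
    assumed to start at frame i * hop.
    mode: "mean" (average) or "median" (majority vote for binary scores).
    """
    collected = [[] for _ in range(num_frames)]
    for i, seg in enumerate(segment_scores):
        start = i * hop
        for j, score in enumerate(seg):
            if start + j < num_frames:
                collected[start + j].append(score)
    reduce_fn = np.mean if mode == "mean" else np.median
    # Frames covered by no segment default to 0 (non-speech).
    return np.array([reduce_fn(v) if v else 0.0 for v in collected])
```

For binary per-segment labels, the median acts as a majority vote across the segments spanning each frame, while the mean yields a soft score that can be thresholded afterwards.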
“…In order to alleviate the effect of the loudness differences shown in Table 1, we scale the audio signal for each speaker turn separately, ensuring each turn lies in the range [-1, 1]. We apply a domain-adversarial neural-network-based voice activity detection model [18] for intra-speaker-turn segmentation. We also exclude speaker turns shorter than 2 seconds in order to extract robust acoustic features.…”
Section: Data Preprocessing
confidence: 99%
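The preprocessing in this excerpt amounts to per-turn peak normalization plus a minimum-duration filter. A minimal sketch, assuming turn boundaries are given in seconds and the sample rate is known (the function name and defaults are hypothetical):

```python
import numpy as np

def normalize_turns(signal, turns, min_dur=2.0, sr=16000):
    """Peak-normalize each speaker turn to [-1, 1] independently,
    dropping turns shorter than min_dur seconds.

    signal: 1-D waveform array.
    turns: list of (start_sec, end_sec) speaker-turn boundaries.
    Returns a list of normalized per-turn waveforms.
    """
    out = []
    for start, end in turns:
        if end - start < min_dur:
            continue  # exclude short turns, as in the excerpt above
        seg = signal[int(start * sr):int(end * sr)].astype(float)
        peak = np.max(np.abs(seg))
        out.append(seg / peak if peak > 0 else seg)
    return out
```

Normalizing each turn separately, rather than the whole recording at once, prevents one loud speaker from compressing the dynamic range available to quieter speakers.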