2017
DOI: 10.1186/s13636-017-0120-6

A robust polynomial regression-based voice activity detector for speaker verification

Abstract: Robustness against background noise is a major research area for speech-related applications such as speech recognition and speaker recognition. One of the many solutions for this problem is to detect speech-dominant regions by using a voice activity detector (VAD). In this paper, a second-order polynomial regression-based algorithm is proposed with a similar function as a VAD for text-independent speaker verification systems. The proposed method aims to separate steady noise/silence regions, steady speech reg…

Citations: cited by 5 publications (3 citation statements)
References: 39 publications
“…The crucial point is normalizing the filter-bank magnitude values. In this work, a recently proposed [8] voice activity detector is utilized to detect speech regions for the normalization. For each filter, average speech magnitude is calculated by using these regions.…”
Section: Normalization and Scaling
Citation type: mentioning
confidence: 99%
“…There are many energy normalization techniques in the literature, such as in [6] and [7]; however, it is unclear whether a normalization step is added in many speech/speaker recognition studies. We utilized a recently proposed voice activity detection algorithm [8] to detect speech regions in each filter of the filter bank separately and apply energy normalization based on these regions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Chiang [4] discussed a parametric prosody coding approach for Mandarin discourse utilizing a various leveled prosodic model expressed that past research a novel parametric prosody coding approach for Mandarin discourse is proposed. Disken et al [5] proposed an algorithm showed superior verification performance both with the conventional GMM-universal background model and universal background model (UBM) method, and the state of-the-art i-vector method. Farahani [6] discussed the robust features extractions using autocorrelation domain for noisy speech recognition.…”
Section: Literature Review
Citation type: mentioning
confidence: 99%
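
The statements from the "Normalization and Scaling" and "Introduction" sections describe using detected speech regions to compute an average speech magnitude for each filter and normalizing on that basis. The sketch below is only an illustration of that idea, not the citing authors' implementation. The function name `normalize_filterbank`, the per-filter boolean `speech_mask` (standing in for the VAD output), the division by the average, and the `eps` guard are all assumptions introduced here.

```python
from typing import List, Sequence


def normalize_filterbank(
    magnitudes: Sequence[Sequence[float]],
    speech_mask: Sequence[Sequence[bool]],
    eps: float = 1e-10,
) -> List[List[float]]:
    """Scale each filter by its average magnitude over speech frames.

    magnitudes[t][k]  : magnitude of filter k at frame t
    speech_mask[t][k] : True where a (hypothetical) VAD marked filter k
                        at frame t as speech
    """
    n_frames = len(magnitudes)
    n_filters = len(magnitudes[0]) if n_frames else 0
    normalized = [list(row) for row in magnitudes]

    for k in range(n_filters):
        # Collect this filter's magnitudes in frames flagged as speech.
        speech_vals = [
            magnitudes[t][k] for t in range(n_frames) if speech_mask[t][k]
        ]
        if not speech_vals:
            continue  # assumption: leave the filter unscaled if no speech found

        avg = sum(speech_vals) / len(speech_vals)
        scale = max(avg, eps)  # guard against division by zero
        for t in range(n_frames):
            normalized[t][k] = magnitudes[t][k] / scale

    return normalized


if __name__ == "__main__":
    mags = [[2.0, 1.0], [4.0, 3.0], [6.0, 5.0]]
    mask = [[False, True], [True, True], [True, False]]
    for row in normalize_filterbank(mags, mask):
        print(row)
```

After this scaling, the speech frames of every filter average to 1, which makes the filters comparable regardless of their original level. Whether the cited work divides, subtracts in the log domain, or applies some other scaling is not stated in the quoted text.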