2007
DOI: 10.1016/j.neucom.2007.08.003
Probabilistic feature-based transformation for speaker verification over telephone networks

Abstract: Feature transformation aims to reduce the effects of channel- and handset-distortion in telephone-based speaker verification. This paper compares several feature transformation techniques and evaluates their verification performance and computation time under the 2000 NIST speaker recognition evaluation protocol. Techniques compared include feature mapping (FM), stochastic feature transformation (SFT), blind stochastic feature transformation (BSFT), feature warping (FW), and short-time Gaussianization (STG). Th…
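As a concrete illustration of one of the compared techniques, below is a minimal sketch of feature warping (FW): each feature dimension is independently mapped, over a sliding window, onto a standard normal distribution through its rank-based empirical CDF. The window length and ranking convention here are generic assumptions, not the paper's exact configuration.

```python
# Minimal feature-warping sketch (assumed window length and ranking details).
import numpy as np
from scipy.stats import norm

def feature_warp(features: np.ndarray, win: int = 301) -> np.ndarray:
    """Warp each feature dimension toward a standard normal distribution.

    features: (num_frames, num_dims) array of e.g. MFCCs.
    win: odd sliding-window length in frames (~3 s at a 10 ms frame shift).
    """
    num_frames, _ = features.shape
    half = win // 2
    warped = np.empty_like(features, dtype=float)
    for t in range(num_frames):
        lo, hi = max(0, t - half), min(num_frames, t + half + 1)
        window = features[lo:hi]                      # local context around frame t
        # Rank of the centre frame within the window, per dimension.
        rank = (window < features[t]).sum(axis=0) + 1
        # Map the empirical CDF value to the standard normal quantile.
        warped[t] = norm.ppf((rank - 0.5) / window.shape[0])
    return warped
```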

Cited by 19 publications (9 citation statements)
References 17 publications
“…For the acoustic GMM-UBM system [1], we applied several channel compensation techniques, including feature warping [40], Z-norm [41], short-time Gaussianization (STG) [42] and fast blind stochastic feature transformation (fBSFT) [43]. Acoustic scores S_GMM-UBM were computed based on the log-likelihood ratio:…”
Section: Scoring Methods
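The exact expression is elided in the excerpt above; what follows is a minimal sketch of the standard GMM-UBM log-likelihood-ratio score it refers to, averaged over frames. The use of sklearn's GaussianMixture here is purely illustrative, not the cited system's implementation.

```python
# Standard GMM-UBM log-likelihood-ratio scoring sketch (illustrative only).
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_ubm_llr(frames: np.ndarray,
                speaker_gmm: GaussianMixture,
                ubm: GaussianMixture) -> float:
    """Average per-frame log-likelihood ratio of the speaker model vs. the UBM.

    frames: (num_frames, num_dims) channel-compensated feature vectors.
    """
    # score_samples returns the log-likelihood of each frame under the model.
    ll_speaker = speaker_gmm.score_samples(frames)
    ll_ubm = ubm.score_samples(frames)
    return float(np.mean(ll_speaker - ll_ubm))
```

The verification decision then compares this score against a threshold; score normalisations such as the Z-norm mentioned in the quote adjust the score statistics per speaker model before thresholding.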
“…Speaker Detection Performance. For (a), short-time Gaussianization (STG) and fast blind stochastic feature transformation (fBSFT) [43] were applied to the low-level features, and for (b) feature warping was applied.…”
Section: First Principal Axis, Second Principal Axis
“…The MFCCs and delta MFCCs were concatenated to form 38-dimensional feature vectors. Cepstral mean subtraction (CMS), fast blind stochastic feature transformation (fBSFT) [26], [3] and short-time Gaussianization (STG) [27] were applied to the MFCCs to remove channel effects.…”
Section: Speech Corpora and Speech Features
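A minimal sketch of the 38-dimensional MFCC plus delta-MFCC front end with cepstral mean subtraction described in the quote above is given below; librosa and its parameters (sampling rate, filterbank defaults) are illustrative assumptions, not the cited paper's exact setup.

```python
# MFCC + delta-MFCC front end with cepstral mean subtraction (assumed parameters).
import numpy as np
import librosa

def extract_features(wav_path: str) -> np.ndarray:
    """Return (num_frames, 38) mean-normalised MFCC + delta-MFCC features."""
    signal, sr = librosa.load(wav_path, sr=8000)              # telephone-band speech
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=19)   # (19, T) static MFCCs
    delta = librosa.feature.delta(mfcc)                       # (19, T) delta MFCCs
    feats = np.vstack([mfcc, delta]).T                        # (T, 38)
    # Cepstral mean subtraction: remove the per-utterance mean of each dimension,
    # which cancels a stationary convolutive channel in the cepstral domain.
    return feats - feats.mean(axis=0, keepdims=True)
```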
“…Making the low-level features robust, however, does not come without a price. It has been shown recently that using STG and fast BSFT as feature preprocessors requires 52 seconds to process a 53-second utterance on a Pentium IV 3.2GHz CPU, whereas processing the same utterance by the less powerful cepstral mean subtraction takes only 0.02 seconds [3].…”
Section: Choice of Relevance Factors