2008
DOI: 10.1016/j.csl.2007.05.003

Explicit modelling of session variability for speaker verification

Abstract: This article describes a general and powerful approach to modelling mismatch in speaker recognition by including an explicit session term in the Gaussian mixture speaker modelling framework. Under this approach, the Gaussian mixture model (GMM) that best represents the observations of a particular recording is the combination of the true speaker model with an additional session-dependent offset constrained to lie in a low-dimensional subspace representing session variability. A novel and efficient model trainin…
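The composition described in the abstract (a speaker model plus a session-dependent offset confined to a low-dimensional subspace) can be sketched as follows. This is a minimal illustration, not the article's implementation: the names `speaker_mean`, `U`, and `z` are assumptions introduced here, with `U` standing for a matrix whose columns span the session subspace and `z` for the low-dimensional session factors of one recording.

```python
def session_supervector(speaker_mean, U, z):
    """Return speaker_mean + U @ z using plain Python lists.

    speaker_mean: length-D list, the speaker's GMM mean supervector (assumed name).
    U:            D x R list of rows, basis of the session subspace (assumed name).
    z:            length-R list, session factors for one recording (assumed name).
    """
    if len(U) != len(speaker_mean) or any(len(row) != len(z) for row in U):
        raise ValueError("dimension mismatch between speaker_mean, U and z")
    # Each output coordinate is the speaker mean plus that row's projection of z.
    return [m + sum(u * zi for u, zi in zip(row, z))
            for m, row in zip(speaker_mean, U)]


# Toy example: D = 3 supervector coordinates, R = 2 session dimensions.
toy_mean = [1.0, 2.0, 3.0]
toy_U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_z = [0.5, -0.5]
toy_result = session_supervector(toy_mean, toy_U, toy_z)
```

Because R is much smaller than D in practice, the offset `U @ z` can only move the means along a few directions, which is what "constrained to lie in a low-dimensional subspace" expresses.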

Citations: cited by 110 publications (119 citation statements)
References: 13 publications
“…The subspace U is estimated using an expectation-maximization (EM) algorithm. For details of the ISV technique for face-verification, please refer to the works of Wallace et al [3] and Vogt et al [30]. When enrolling a new client, i, using a set of enrollment images (indexed by j), the latent variables x_{i,j} and z_i are estimated from the enrollment images, and finally, the client-specific supervector, c_i, is computed as:…”
Section: GMM-based FR Using Inter-session Variability Modeling
Citation type: mentioning
confidence: 99%
“…Many techniques have been proposed with the most notable systems based on Gaussian mixture model (GMM), inter-session variability (ISV) modeling [10], joint factor analysis (JFA) [16], and i-vectors [11].…”
Section: Vulnerability Of Voice Biometrics
Citation type: mentioning
confidence: 99%
“…To demonstrate vulnerability of ASV systems to presentation attacks, we consider two systems based on inter-session variability (ISV) modeling [10] and i-vectors [11], which are the state of the art speaker verification systems able to effectively deal with intra-class and inter-class variability. In these systems, voice activity detection is based on the modulation of the energy around 4 Hz, the features include 20 mel-scale frequency coefficients (MFCC) and energy, with their first and second derivatives, and modeling was performed with 256 Gaussian components using 25 expectation-maximization (EM) iterations.…”
Section: Vulnerability Of Voice Biometrics
Citation type: mentioning
confidence: 99%
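The configuration quoted in the statement above implies specific model sizes, which the short sketch below works out. It assumes, as the quoted wording suggests but does not state outright, that the first and second derivatives are taken over both the 20 coefficients and the energy term; the function names are hypothetical and belong to no particular toolkit.

```python
def feature_dim(n_cepstra=20, with_energy=True, n_derivative_orders=2):
    """Per-frame feature dimension: static terms plus their derivatives.

    Assumption: derivatives are appended for every static term, energy included.
    """
    static = n_cepstra + (1 if with_energy else 0)
    # One copy of the static block, plus one copy per derivative order.
    return static * (1 + n_derivative_orders)


def supervector_dim(n_components, dim):
    """Length of a GMM mean supervector: all component means stacked end to end."""
    return n_components * dim


frame_dim = feature_dim()                   # 20 coefficients + energy, with two derivative orders
sv_dim = supervector_dim(256, frame_dim)    # 256 Gaussian components, as quoted
print(frame_dim, sv_dim)
```

Under these assumptions each frame has 63 dimensions and the mean supervector has 16128 entries, which gives a sense of why the session offset is restricted to a much smaller subspace.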
“…The most commonly used acoustic vectors are Mel Frequency Cepstral Coefficients (MFCC), Linear Prediction Cepstral Coefficients (LPCC) and Perceptual Linear Prediction Cepstral (PLPC) Coefficients and zero crossing coefficients (Yegnanarayana et al, 2005; Vogt et al, 2005). All these features are based on the spectral information derived from a short time windowed segment of speech.…”
Section: Literature Review
Citation type: mentioning
confidence: 99%