Robbie Vogt scite author profile

This article describes a general and powerful approach to modelling mismatch in speaker recognition by including an explicit session term in the Gaussian mixture speaker modelling framework. Under this approach, the Gaussian mixture model (GMM) that best represents the observations of a particular recording is the combination of the true speaker model with an additional session-dependent offset constrained to lie in a low-dimensional subspace representing session variability.

A novel and efficient model training procedure is proposed in this work to perform the simultaneous optimisation of the speaker model and session variables required for speaker training. Using a similar iterative approach to the Gauss-Seidel method for solving linear systems, this procedure greatly reduces the memory and computational resources required by a direct solution.

Extensive experimentation demonstrates that the explicit session modelling provides up to a 68% reduction in detection cost over a standard GMM-based system and significant improvements over a system utilising feature mapping, and is shown to be effective on the corpora of recent National Institute of Standards and Technology (NIST) Speaker Recognition Evaluations, exhibiting different session mismatch conditions.
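The abstract likens the training procedure to the Gauss-Seidel method for solving linear systems. As a rough illustration of that general method only (not the article's speaker-training procedure, whose details are not given here), a minimal Python sketch follows. The function name `gauss_seidel`, the tolerance, the iteration cap, and the small example system are all assumptions chosen for illustration.

```python
def gauss_seidel(A, b, iterations=100, tol=1e-10):
    """Approximately solve A x = b with Gauss-Seidel sweeps.

    A is a square matrix given as a list of rows with a nonzero diagonal.
    Convergence is guaranteed when A is strictly diagonally dominant or
    symmetric positive definite; other matrices may not converge.
    """
    n = len(b)
    x = [0.0] * n  # initial guess
    for _ in range(iterations):
        max_change = 0.0
        for i in range(n):
            # Use the most recent values of every other unknown,
            # including those already updated in this sweep.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - s) / A[i][i]
            max_change = max(max_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if max_change < tol:
            break  # successive sweeps no longer change the estimate
    return x


# Hypothetical 2x2 example: 4x + y = 1, 2x + 3y = 2
A = [[4.0, 1.0], [2.0, 3.0]]
b = [1.0, 2.0]
x = gauss_seidel(A, b)
```

Each sweep updates one unknown at a time while holding the others at their latest values, so the system is never inverted or factorised as a whole. That is the general sense in which an iterative scheme of this kind can need less memory and computation than a direct solution.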


i-vector based speaker recognition on short utterances

Kanagasundaram, Vogt, Dean et al. 2011

171


Experiments in Session Variability Modelling for Speaker Verification

Vogt, Sridharan


The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms

et al. 2010


Making Confident Speaker Verification Decisions With Minimal Speech

Vogt, Sridharan, Mason 2010, IEEE Trans. Audio Speech Lang. Process.



Explicit modelling of session variability for speaker verification
