Interspeech 2019
DOI: 10.21437/interspeech.2019-1891

A Study of x-Vector Based Speaker Recognition on Short Utterances

Abstract: This paper explores how in-domain and out-domain probabilistic linear discriminant analysis (PLDA) speaker verification behaves when enrolment and verification lengths are reduced. Experimental studies have found that when full-length utterances are used for evaluation, the in-domain PLDA approach shows more than 28% improvement in EER and DCF values over the out-domain PLDA approach, and that when short utterances are used for evaluation, the performance gain of in-domain speaker verification reduces at an increasing rate. Novel m…

Cited by 62 publications (65 citation statements)
References 15 publications
“…1(c)), EERs are better for the former, which could be because phonetic content variability limited to only 12 sentences. For comparison, the authors in [30] report EER increasing from 2.5% to more than 20% when going from full-length recordings to 5sec versus 5sec trials on NIST 2010 corpora.…”
Section: Speaker Verification (mentioning)
confidence: 99%
“…State-of-the-art ASV systems exhibit satisfactory performance with adequately long ( 2 minutes) speech data. However, reduction in amount of speech drastically degrades the ASV performance [10,12,18,19,20]. The requirement of sufficiently long speech for training or testing, especially in presence of large intersession variability has limited the potential of widespread real-world implementations.…”
Section: Short Utterance In Speaker Recognition (mentioning)
confidence: 99%
“…ASV is undisputedly a crucial technology for biometric identification, which is broadly applied in real-world applications like banking and home automation. Considerable performance improvements in terms of both accuracy and efficiency of ASV systems have been achieved through active research in a diversity of approaches [1][2][3][4][5][6]. [4] proposed a method that use the Gaussian mixture model to extract acoustic features and then apply the likelihood ratio for scoring.…”
Section: Introduction (mentioning)
confidence: 99%