2016 2nd International Conference on Advanced Technologies for Signal and Image Processing (ATSIP) 2016
DOI: 10.1109/atsip.2016.7523167

Cohort selection for text-dependent speaker verification score normalization

Abstract: Subspace-based techniques, such as i-vector and Joint Factor Analysis (JFA), have been shown to provide state-of-the-art performance for fixed-phrase text-dependent speaker verification. However, the error rates of such systems on the random digit task of the RSR dataset are higher than those of the Gaussian Mixture Model-Universal Background Model (GMM-UBM). In this paper, we aim at improving the i-vector system by normalizing the content of the enrollment data to match the test data. We estimate i-vectors for each frames…

Cited by 4 publications (3 citation statements)
References 10 publications

“…The most reliable and simplest form of normalization that is based on the estimation of mean and variance for the genuine or target speaker distribution is Z-norm [29]. The important highlight of the Z-norm is that it doesn't need to perform online permutations during the training process.…”
Section: Zero Normalization (Z-norm) | Type: mentioning
confidence: 99%
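
For readers unfamiliar with the technique named in the statement above, Z-norm can be sketched as follows. This is a minimal illustration, not code from the cited paper or from the citing publication. It assumes the usual formulation, in which the mean and standard deviation are estimated offline from cohort (impostor) utterances scored against the enrolled speaker's model, and each raw verification score is then shifted and scaled by those statistics. The function names `znorm_params` and `znorm` and the example scores are hypothetical.

```python
from statistics import mean, pstdev


def znorm_params(cohort_scores):
    """Estimate Z-norm statistics for one enrolled speaker model.

    cohort_scores: scores of cohort (impostor) utterances against
    that model. These can be computed once, offline, at enrollment.
    """
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)  # population standard deviation
    return mu, sigma


def znorm(raw_score, mu, sigma, eps=1e-12):
    """Normalize a raw verification score with precomputed statistics.

    eps guards against division by zero when all cohort scores are equal.
    """
    return (raw_score - mu) / max(sigma, eps)


# Hypothetical cohort scores for one enrolled model (illustrative only).
cohort_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
mu, sigma = znorm_params(cohort_scores)

# Normalize a hypothetical test-trial score against the same model.
normalized = znorm(6.0, mu, sigma)
print(mu, sigma, normalized)
```

Because the statistics depend only on the enrolled model and the cohort, nothing beyond the subtraction and division has to be computed at test time.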
“…A cohort speaker's set is one where different speakers utter the same prompt [3]. Here a four cohort speaker's set is considered.…”
Section: Cohort Set Preparation | Type: mentioning
confidence: 99%
“…In TD-SV system, the reference and the testing phrase are same. In this case, speakers speak the same text during the training and testing period [2], [3], [4], [5]. On the other hand, in TI-SV system there is no such bound [6].…”
Section: Introduction | Type: mentioning
confidence: 99%