Interspeech 2011
DOI: 10.21437/interspeech.2011-64

Comparison of speaker recognition approaches for real applications

Abstract: This paper describes the experimental setup and the results obtained using several state-of-the-art speaker recognition classifiers. The comparison of the different approaches aims at the development of real-world applications, taking into account memory and computational constraints, and possible mismatches with respect to the training environment. The NIST SRE 2008 database has been considered our reference dataset, whereas nine commercially available databases of conversational speech in languages different…

Cited by 51 publications (28 citation statements). References 7 publications.
“…We create scores for all fine-tuned systems using adaptive s-normalization [18] with an imposter cohort size of 100. The imposter cohort consists of the average of the length-normalized utterance-based embeddings of each training speaker.…”
Section: Large Margin Fine-tuning
Citation type: mentioning, confidence: 99%
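The adaptive s-normalization in this excerpt is compact enough to sketch directly. Below is a minimal NumPy version, assuming cosine scoring on length-normalized embeddings; the cohort construction (one averaged, length-normalized embedding per training speaker) and the cohort size of 100 come from the excerpt, while the function names and the exact symmetric form are illustrative assumptions drawn from the as-norm literature it cites.

```python
import numpy as np

def length_norm(x):
    """L2-normalize embeddings along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def adaptive_snorm(enroll, test, cohort, top_n=100):
    """As-normalized cosine score for a single trial.

    enroll, test : 1-D trial embeddings.
    cohort       : (num_speakers, dim) imposter embeddings, e.g. the average
                   length-normalized embedding of each training speaker.
    top_n        : size of the adaptive imposter cohort (100 in the excerpt).
    """
    enroll, test, cohort = length_norm(enroll), length_norm(test), length_norm(cohort)
    raw = float(enroll @ test)  # plain cosine score of the trial

    # Score each side against the cohort and keep the top_n closest imposters;
    # the per-trial selection is what makes the s-norm "adaptive".
    e_top = np.sort(cohort @ enroll)[-top_n:]
    t_top = np.sort(cohort @ test)[-top_n:]

    # Symmetric normalization: average the two z-normalized scores.
    return 0.5 * ((raw - e_top.mean()) / e_top.std()
                  + (raw - t_top.mean()) / t_top.std())
```

Selecting only the closest imposters per trial keeps the normalization statistics relevant to the trial's own score region, which is the usual motivation for the adaptive variant over plain s-norm.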
“…In contrast to the methods reviewed thus far, other methods attempt to overcome domain mismatch given existing embeddings. Score Normalization (SN) takes a single trial (enrollment, test recordings, and corresponding matching score) and normalizes the score according to the score distributions of the enrollment/test recordings with respect to an imposter dataset [17]. PLDA Adaptation (PA) [18] uses an unlabeled adaptation dataset to modify the PLDA model used to score the samples.…”
Section: Related Work
Citation type: mentioning, confidence: 99%
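Of the two adaptation families named in this excerpt, PLDA adaptation is the less self-explanatory. Below is a minimal sketch of one common unsupervised recipe, covariance interpolation toward unlabeled in-domain data, offered as a plausible stand-in for the method the excerpt cites as [18] rather than its exact algorithm; the function name, the eigenvalue clipping, and the 0.3/0.7 variance split are all assumptions.

```python
import numpy as np

def adapt_plda(between_cov, within_cov, adapt_embeddings,
               between_scale=0.3, within_scale=0.7):
    """Shift out-of-domain PLDA covariances toward unlabeled in-domain data."""
    centered = adapt_embeddings - adapt_embeddings.mean(axis=0)
    total_in = centered.T @ centered / len(centered)

    # In-domain variance that the out-of-domain model fails to explain.
    excess = total_in - (between_cov + within_cov)

    # Clip negative eigenvalues so the update only ever adds variance.
    eigval, eigvec = np.linalg.eigh((excess + excess.T) / 2)
    excess = eigvec @ np.diag(np.clip(eigval, 0.0, None)) @ eigvec.T

    # Split the unexplained variance between the two covariances; without
    # labels there is no principled split, hence the fixed scales.
    return (between_cov + between_scale * excess,
            within_cov + within_scale * excess)
```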
“…The most common way of calculating a similarity between a pair of face embeddings is to simply use a cosine similarity: sim(·, ·) ≡ cos(·, ·). Nevertheless, we have empirically observed that normalizing the cosine similarities with Adaptive-Symmetric NORMalization (as-norm) [32] largely avoids false positive verification errors (which are the most costly ones according to the challenges' metrics). Therefore, we employ as-norm for score normalization in our final pipeline (the IJB-C dataset [33], containing about 3500 identities, is used to calculate the reference cohort).…”
Section: Calculation Of Similarity Between Face Embeddings
Citation type: mentioning, confidence: 99%
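This is the same adaptive symmetric normalization as in the first excerpt, applied to face rather than speaker embeddings. The usage sketch below reuses the hypothetical length_norm and adaptive_snorm helpers defined in the earlier sketch; the random embeddings and the cohort of 3,500 identities are stand-ins (a real pipeline would average each cohort identity's embeddings, as both excerpts describe).

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_ids = 256, 3500  # cohort sized like the ~3,500-identity set in the excerpt

# One length-normalized reference embedding per cohort identity (stand-ins).
cohort = length_norm(rng.standard_normal((n_ids, dim)))

enroll = rng.standard_normal(dim)  # enrollment face embedding (stand-in)
test = rng.standard_normal(dim)    # test face embedding (stand-in)

raw = float(length_norm(enroll) @ length_norm(test))
normed = adaptive_snorm(enroll, test, cohort, top_n=100)
print(f"raw cosine: {raw:+.3f}   as-norm: {normed:+.3f}")
```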