Interspeech 2018
DOI: 10.21437/interspeech.2018-1685

Robust and Discriminative Speaker Embedding via Intra-Class Distance Variance Regularization

Abstract: Learning a good speaker embedding is critical for many speech processing tasks, including recognition, verification, and diarization. To this end, we propose a complementary optimizing goal called intra-class loss to improve deep speaker embeddings learned with triplet loss. This loss function is formulated as a soft constraint on the averaged pair-wise distance between samples from the same class. Its goal is to prevent the scattering of these samples within the embedding space to increase the intra-class com…
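The abstract describes the intra-class loss as a soft constraint on the averaged pair-wise distance between same-class embeddings. The paper's exact formulation is not reproduced on this page, so the following is only a minimal NumPy sketch of such a penalty, assuming a hinge-style constraint with a hypothetical `margin` parameter:

```python
import numpy as np

def intra_class_loss(embeddings, labels, margin=0.1):
    """Sketch of an intra-class penalty: the averaged pairwise Euclidean
    distance between same-class embeddings is penalized once it exceeds
    a soft margin. The hinge form and the margin value are assumptions,
    not the paper's exact loss."""
    total = 0.0
    classes = np.unique(labels)
    for c in classes:
        group = embeddings[labels == c]
        n = len(group)
        if n < 2:
            continue
        # average over all unordered pairs within the class
        dists = [np.linalg.norm(group[i] - group[j])
                 for i in range(n) for j in range(i + 1, n)]
        # soft constraint: zero penalty while the class stays compact
        total += max(float(np.mean(dists)) - margin, 0.0)
    return total / len(classes)
```

In training, this term would be added with a weighting factor to the triplet loss rather than used on its own, matching the abstract's description of a "complementary optimizing goal".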


Cited by 22 publications (17 citation statements); references 18 publications.
“…with some previous works on the same dataset.

    Method                                  EER (%)
    GMM-UBM [35]                            15.0
    I-vectors + PLDA [35]                   8.8
    CNN [35]                                7.8
    CNN + intra-class + triplet loss [43]   7.9
    SincNet [21]                            7.2
    SincNet+LIM (proposed)                  5.8

Table 3: Equal Error Rate (EER%) obtained on speaker verification (using the VoxCeleb corpus).…”
Section: Speaker Verification
confidence: 99%
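The citation above compares systems by Equal Error Rate (EER), the operating point where the false-accept rate (FAR) and false-reject rate (FRR) coincide. As a reminder of the metric, here is a self-contained sketch (not any cited system's code; sweeping thresholds over the observed scores is a simplifying assumption):

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate EER: sweep thresholds over the observed scores and
    return the point where FAR and FRR are closest.
    scores: higher means 'same speaker'; labels: 1 = target trial."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    target = scores[labels == 1]
    nontarget = scores[labels == 0]
    thresholds = np.sort(np.unique(scores))
    # FAR: non-target trials accepted; FRR: target trials rejected
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    frr = np.array([(target < t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2.0
```

On real trial lists, figures like those in the table (e.g. 5.8% for SincNet+LIM) come from this kind of FAR/FRR crossing, usually computed with interpolation for extra precision.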
“…Extended ResNet implementation from [15], named dilated residual network (DltResNet), is used as the third speaker verification method. The implementation is publicly available [11].…”
(Footnote 10: Kaldi GitHub: https://github.com/kaldi-asr/kaldi)
Section: Dilated Residual Network (DltResNet)
confidence: 99%
“…In [219], a complementary optimizing goal called intra-class loss is proposed to improve deep speaker embeddings learned with triplet loss. It is shown in the paper that models trained using intra-class loss can yield a significant relative reduction of 30% in equal error rate (EER) compared to the original triplet loss.…”
Section: Deep Learning Work On Voice Recognition
confidence: 99%