Interspeech 2022
DOI: 10.21437/interspeech.2022-11301

Barlow Twins self-supervised learning for robust speaker recognition

Abstract: Acoustic noise is a major challenge for speaker recognition systems. State-of-the-art speaker recognition systems are based on deep neural network speaker embeddings known as x-vectors, so a noise-robust x-vector extractor is highly desirable. In this paper, we introduce the Barlow Twins self-supervised loss function to the area of speaker recognition. The Barlow Twins objective function optimizes two criteria: first, it increases the similarity between two versions of the …
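The abstract is cut off before the second criterion, but the Barlow Twins objective of Zbontar et al. (2021) is well documented: an invariance term pulls the embeddings of two views of the same input together, and a redundancy-reduction term decorrelates the embedding dimensions. The sketch below shows this loss in PyTorch; pairing a clean utterance with a noise-augmented copy, and the off-diagonal weight, are illustrative assumptions rather than details taken from this paper.

```python
import torch

def barlow_twins_loss(z_a: torch.Tensor, z_b: torch.Tensor,
                      lambda_offdiag: float = 5e-3) -> torch.Tensor:
    """Barlow Twins loss over two batches of embeddings of shape (batch, dim)."""
    n, _ = z_a.shape
    # Standardize each embedding dimension across the batch.
    z_a = (z_a - z_a.mean(dim=0)) / (z_a.std(dim=0) + 1e-6)
    z_b = (z_b - z_b.mean(dim=0)) / (z_b.std(dim=0) + 1e-6)

    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = (z_a.T @ z_b) / n

    # Invariance term: diagonal entries should be 1 (the two views agree).
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    # Redundancy-reduction term: off-diagonal entries should be 0.
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + lambda_offdiag * off_diag


# Example: embeddings of a clean and a noise-augmented view of 32 utterances.
# (Random tensors stand in for the outputs of an x-vector-style encoder.)
z_clean = torch.randn(32, 256)
z_noisy = torch.randn(32, 256)
loss = barlow_twins_loss(z_clean, z_noisy)
```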

Cited by 12 publications (2 citation statements)
References 15 publications

Citation statements:
“…• Introduce a novel personalized encoding method using each data subject's sound samples to build their profiles, based on pre-training with contrastive learning, that can be applied to all users through semi-supervised learning. [6,7,8,9] regarding the applicability and efficacy of such models in real-world scenarios. However, it is still an under-researched problem in the sound domain.…”
Section: Introduction (mentioning)
Confidence: 99%

“…Two major dimensions of research around loss functions may be found in the machine learning literature. Some of them are based on classification (softmax cross-entropy loss, center loss [3]), while others achieve representation learning (contrastive loss [4], triplet loss [5,6], circle loss [7], Barlow Twins [8,9]). However, both types of loss functions suffer from major issues: the triplet loss for representation learning, for instance, exhibits a combinatorial explosion in the number of possible triplets, especially for large-scale datasets, leading to a drastically increased number of training steps.…”
Section: Introduction (mentioning)
Confidence: 99%
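As a rough illustration of the combinatorial growth mentioned in the excerpt above, the count below compares the number of candidate (anchor, positive, negative) triplets with the number of two-view pairs a Barlow Twins-style objective needs per pass. The dataset sizes are made up for illustration and do not come from either paper.

```python
# Illustrative only: a hypothetical balanced dataset of 1,000 speakers with
# 10 utterances each (not figures from the cited papers).
n_speakers, n_per_speaker = 1_000, 10
n = n_speakers * n_per_speaker

# Anchor: any utterance; positive: another utterance of the same speaker;
# negative: any utterance of a different speaker.
triplets = n * (n_per_speaker - 1) * (n - n_per_speaker)

# A Barlow Twins-style objective only needs one augmented pair per utterance.
pairs = n

print(f"candidate triplets: {triplets:,}")  # 899,100,000
print(f"two-view pairs:     {pairs:,}")     # 10,000
```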