Interspeech 2019
DOI: 10.21437/interspeech.2019-2437

Language Recognition Using Triplet Neural Networks

Abstract: In this paper, we propose a novel neural network back-end approach based on triplets for the language recognition task, motivated by its successful application in the related field of text-dependent speaker verification. A triplet is a training example constructed of three audio samples: two from the same class and one from a different class. By presenting two pairs of samples to the network, the triplet neural network learns to discriminate between pairs of samples from the same language and pairs from different languages. Trip…
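The triplet construction described in the abstract can be sketched as a standard margin-based triplet loss; this is a minimal illustration, not the paper's exact formulation (the margin value and squared-distance choice here are assumptions):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: the anchor and positive come from the
    same language, the negative from a different one. The loss pulls
    same-language embeddings together and pushes different-language
    embeddings at least `margin` further apart."""
    d_pos = np.sum((anchor - positive) ** 2)  # anchor-to-same-class distance
    d_neg = np.sum((anchor - negative) ** 2)  # anchor-to-other-class distance
    return max(d_pos - d_neg + margin, 0.0)
```

The loss is zero whenever the negative is already at least `margin` farther from the anchor than the positive, so training focuses on triplets that violate the margin.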


Cited by 10 publications (10 citation statements); references 26 publications.
“…The loss must optimise the network by learning these interactions between classes to generalise better to the instances where there is a tiny threshold between the classes. A loss that loosely fits this description is the Triplet loss function used in [11], [12] and [13] for language identification tasks. By using the Triplet loss function, the weights are being optimised by comparing different class embeddings with one another and optimising the distance between the embeddings such that different classes are far from one another.…”
Section: Triplet Entropy Loss
confidence: 99%
“…Similar methods were applied in [9] and [10]. The researchers in [11], [12] and [13] looked at using Triplet loss on various speech tasks such as LID and user identification. [13] specifically looked at implementing a LID system based on triplet networks.…”
Section: Introduction
confidence: 99%
“…Metric learning loss functions have been applied to speaker recognition x-vector DNNs: triplet loss [16,17], prototypical networks [16], PLDA-like similarity [18]. For language recognition, the triplet loss has been used to train the backend classifier [19] and cosine similarity has been used during training of an LSTM-based language embedding extractor [20]. The advantage of metric learning over domain adaptation approaches is that it does not rely on the definition of a source and a target domain and can reduce mismatch between a priori unknown domains.…”
Section: Introduction
confidence: 99%
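The cosine-similarity scoring mentioned in the quote above can be illustrated as follows; the function and centroid-based scoring scheme here are an assumed, generic setup, not the exact method of [20]:

```python
import numpy as np

def cosine_score(embedding, centroid):
    """Cosine similarity between a test-utterance embedding and a
    language centroid: +1 for identical directions, 0 for orthogonal.
    Using per-language centroids as enrollment models is an
    illustrative choice, not taken from the cited work."""
    return float(np.dot(embedding, centroid) /
                 (np.linalg.norm(embedding) * np.linalg.norm(centroid)))
```

A test utterance would then be assigned the language whose centroid yields the highest score.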