2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
DOI: 10.1109/apsipaasc47483.2019.9023301
Triplet-Center Loss Based Deep Embedding Learning Method for Speaker Verification

Cited by 3 publications (3 citation statements)
References 15 publications
“…Center loss [39] is only used to reduce intra-class differences; it cannot effectively increase inter-class differences. In recent years, the idea of combining triplet loss and center loss (named triplet-center loss), applied to 3D visual analysis [40] and speaker verification [41], has shown high accuracy compared with other losses. However, our method has lower time complexity than the existing methods.…”
Section: Proposed Methods
confidence: 99%
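To make the triplet-center loss idea in the excerpt above concrete, the following is a minimal PyTorch-style sketch: each embedding is pulled toward a learnable center for its own class and pushed away from the nearest other-class center by at least a margin. The class name, tensor shapes, and margin value are illustrative assumptions, not the cited paper's implementation.

```python
# Hypothetical sketch of triplet-center loss (TCL), not the cited
# paper's code: pull each embedding toward its own class center while
# pushing it away from the nearest other-class center by a margin.
import torch
import torch.nn as nn


class TripletCenterLoss(nn.Module):
    def __init__(self, num_classes: int, embed_dim: int, margin: float = 5.0):
        super().__init__()
        self.margin = margin
        # One learnable center per speaker/class.
        self.centers = nn.Parameter(torch.randn(num_classes, embed_dim))

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Squared Euclidean distance from each embedding to every
        # center: shape (batch, num_classes).
        dists = torch.cdist(embeddings, self.centers).pow(2)
        # Distance to the ground-truth center (the "positive" term).
        pos = dists.gather(1, labels.unsqueeze(1)).squeeze(1)
        # Mask out the true class, then take the nearest wrong center
        # (the "negative" term).
        mask = torch.zeros_like(dists, dtype=torch.bool)
        mask.scatter_(1, labels.unsqueeze(1), True)
        neg = dists.masked_fill(mask, float("inf")).min(dim=1).values
        # Hinge: the positive distance should undercut the nearest
        # negative distance by at least the margin.
        return torch.relu(pos - neg + self.margin).mean()
```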
“…Different network structures have been used for frame-level feature extraction in recent studies, including the time-delay neural network (TDNN) [2], convolutional neural network (CNN) [5,8], and long short-term memory (LSTM) network [9]. Meanwhile, many authors [5,10,11,12] have investigated residual networks (ResNet) [13] with different settings (i.e., width and depth) as a backbone for embedding learning, which can effectively incorporate information from previous layers via skip connections.…”
Section: Introduction
confidence: 99%
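The skip connections the excerpt refers to can be illustrated with a minimal 1-D residual block sketch; the channel count and kernel size are illustrative assumptions, not a reconstruction of any specific cited architecture.

```python
# Minimal sketch of a 1-D residual block for frame-level feature
# extraction. The skip connection adds the input back to the block's
# output so information from earlier layers flows directly forward.
import torch
import torch.nn as nn


class ResBlock1d(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm1d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Skip connection: identity shortcut plus the learned residual.
        return torch.relu(self.body(x) + x)
```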
“…Many recent works have focused on utterance-level aggregation methods, e.g., statistical pooling [3], attentive pooling [8,9], bilinear pooling [4], and dictionary-based pooling methods [10,11]. Meanwhile, other works have proposed different loss functions, including triplet loss [12,13], center loss [10], triplet-center loss [14], angular softmax loss (A-Softmax) [10], and additive margin softmax loss (AM-Softmax) [15,16]. However, in most deep embedding learning methods, the network architectures are trained under identification supervision, i.e., optimized for the speaker identification (SID) task.…”
Section: Introduction
confidence: 99%
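As a hedged illustration of the statistical pooling mentioned in the excerpt, the sketch below maps variable-length frame-level features to a fixed utterance-level vector by concatenating the per-channel mean and standard deviation over time; the function name and tensor layout are assumptions, not the cited papers' code.

```python
# Hypothetical statistical pooling sketch: aggregate frame-level
# features of shape (batch, channels, frames) into a fixed-size
# utterance-level vector of shape (batch, 2 * channels).
import torch


def statistical_pooling(frames: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Per-channel mean over the time axis.
    mean = frames.mean(dim=2)
    # Per-channel standard deviation, clamped for numerical stability.
    std = frames.var(dim=2, unbiased=False).clamp_min(eps).sqrt()
    # Concatenated first- and second-order statistics.
    return torch.cat([mean, std], dim=1)
```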