2018
DOI: 10.1016/j.specom.2018.10.004

Deep neural network based i-vector mapping for speaker verification using short utterances

Abstract: Text-independent speaker recognition using short utterances is a highly challenging task due to the large variation and content mismatch between short utterances. I-vector and probabilistic linear discriminant analysis (PLDA) based systems have become the standard in speaker verification applications, but they are less effective with short utterances. In this paper, we first compare two state-of-the-art universal background model (UBM) training methods for i-vector modeling using full-length and short utteranc…

Cited by 26 publications (13 citation statements)
References 32 publications
“…Though the length of voice recordings in our research are around 10s, research on the same interview speech dataset has showed even 10 seconds length can reach ideal classification accuracy [60]. What’s more, short utterance has been proved to be effective in speaker identification [61–64]. The consistently higher accuracy in this research also showed short voice recordings can reach ideal predicting accuracy.…”
Section: Discussion
Citation type: mentioning
confidence: 60%
“…[70] proposed an orthogonal vector pooling strategy to remove unwanted factors. There are also many robust back-ends for speaker verification [293,294,295,296,297].…”
Section: Other Robust Methods
Citation type: mentioning
confidence: 99%
“…The latest DNN-based speaker embedding approaches have shown promising results for speaker recognition with short utterances [9,30]. Another recent work demonstrates that DNN-based i-vector mapping is useful for speaker recognition with short utterances [31]. Even though the DNN-based methods give good recognition accuracy, they require massive amount of training data, careful selection of network architecture and related tuning parameters.…”
Section: Short Utterance In Speaker Recognition
Citation type: mentioning
confidence: 99%