2020
DOI: 10.1016/j.asoc.2020.106704

Lightweight speaker verification for online identification of new speakers with short segments

Cited by 8 publications (2 citation statements)
References: 10 publications
“…The results show that BGCC-PWPE feature and Triplet-DAM with coupled architecture maintains a large performance advantage. At the same time, the experiments comparing with the current best performing techniques of the text-independent speaker recognition and short duration voiceprint recognition like the GE2E Loss [46], 1D-Triplet-CNN-MFCC-LPC [4], and ResNet-34 + SpecdB [47], from the performances on VoxCeleb-2 and NIST SRE 2010 dataset in Table 6, we can see that, with the proposed method of Trip-DAM-model based on BGCC-PWPE acoustic features, the system performance is further improved in the 3-second speech length and full-length speech. scaled Gauss filter method and effectiveness of perceptual wavelet packet entropy with non-stationary speech feature extraction, which is better at extracting reliable voiceprint and high-frequency features of the speaker and learns a high-resolution embedding of information that improves short-duration speaker recognition performance, concurrently, compensating for the weaknesses of discriminable feature sparsity.…”
Section: Results (mentioning)
confidence: 99%
“…In such circumstances, conventional speaker models based on statistical characteristics fall short of accurately describing speakers. Nevertheless, short duration audio signals have always been momentous difficulties of speaker recognition [6,7]. Simultaneously, acoustic feature extraction is a critical component of short duration speech speaker recognition system [8].…”
Section: Introduction (mentioning)
confidence: 99%