Interspeech 2020
DOI: 10.21437/interspeech.2020-2106

Compact Speaker Embedding: lrx-Vector

Abstract: Deep neural networks (DNNs) have recently been widely used in speaker recognition systems, achieving state-of-the-art performance on various benchmarks. The x-vector architecture is especially popular in this research community, due to its excellent performance and manageable computational complexity. In this paper, we present the lrx-vector system, which is the low-rank factorized version of the x-vector embedding network. The primary objective of this topology is to further reduce the memory requirement of th…
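
The abstract attributes the memory saving to low-rank factorization. As a rough illustration only, and not the authors' implementation, the sketch below counts weights when a dense matrix W (n_out x n_in) is replaced by the product of two thinner matrices U (n_out x rank) and V (rank x n_in). The function names and the layer sizes are hypothetical, chosen for the example; the visible abstract does not give the paper's actual dimensions or ranks.

```python
def dense_params(n_in, n_out):
    """Number of weights in a full-rank linear layer (bias ignored)."""
    return n_in * n_out


def low_rank_params(n_in, n_out, rank):
    """Number of weights when W (n_out x n_in) is replaced by
    U (n_out x rank) times V (rank x n_in)."""
    return rank * (n_in + n_out)


def compression_ratio(n_in, n_out, rank):
    """Fraction of the original weights kept after factorization."""
    return low_rank_params(n_in, n_out, rank) / dense_params(n_in, n_out)


# Hypothetical layer sizes, for illustration only (not taken from the paper).
n_in, n_out, rank = 512, 512, 64
print(dense_params(n_in, n_out))             # 262144
print(low_rank_params(n_in, n_out, rank))    # 65536
print(compression_ratio(n_in, n_out, rank))  # 0.25
```

The factorized form only saves memory when rank is smaller than n_in * n_out / (n_in + n_out); for a square 512 x 512 layer that threshold is 256.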

Cited by 11 publications (8 citation statements)
References 27 publications

“…edge or mobile devices, have not been fully studied. The topic was just recently investigated in [107,102,112].…”
Section: Discussion To the Network (mentioning)
confidence: 99%
“…The application of F-TDNN to deep embedding was also investigated [89,98,100]. Some other parameter reduction works can be found in [101,102]. Recently, [90] integrated TDNN with statistics pooling at each layer for compensating the variation of temporal context in the frame-level transforms.…”
Section: Deep Embedding: Network Structures and Inputs (mentioning)
confidence: 99%
“…Mingote et al [40] explored the KD approach and presented a data augmentation technique to improve robustness of TD-SV systems. Georges et al [24] proposed a low-rank factorized version of the x-vector embedding network [5]. To design the lightweight models, Nunes et al [41] proposed a portable model called additive margin MobileNet1D (AM-MobileNet1D) for speaker identification on mobile devices, which uses raw waveform of speeches as input.…”
Section: Lightweight Architectures For Ti-sv (mentioning)
confidence: 99%
“…On the other hand, with the application for access control in mobile devices [24], the design of SV systems tends to be lightweight and efficient. Existing DNN-based SV models, comprising of millions of parameters, require immense computational resources and are hard to achieve the lightweight goal.…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, with the widespread use of mobile devices [ 24 , 25 , 26 , 27 ], the design of speaker recognition systems tends to be light and efficient. However, existing models cannot be lighter, and the performance decreases drastically when made lighter.…”
Section: Introduction (mentioning)
confidence: 99%