2021
DOI: 10.48550/arxiv.2110.09720
Preprint

Rep Works in Speaker Verification

Abstract: Multi-branch convolutional neural network architectures have attracted considerable attention in speaker verification, since aggregating multiple parallel branches can significantly improve performance. However, this design is not efficient enough at inference time because of the additional model parameters and extra operations. In this paper, we present a new multi-branch network architecture, RepSP-KNet, that uses a re-parameterization technique. With this technique, our backbone model contains an efficient V…

Citations: cited by 1 publication (1 citation statement)
References: 29 publications
“…In the training stage, the model adopts the multi-branch topology to learn the multi-scale speaker's features, while the network uses the re-parameterization approach to convert all branches to a single-path topology in calculation with a high inference speed in the inference stage. The idea of re-parameterization was first proposed in [30] on RepVGG model for image processing, and was later adapted to ASV in [32], [33]. But these models were based on Rep-VGG's structure, where only a limited number of branches (temporal scales) could be integrated.…”
Section: B Multi-branch-based Speaker Embedding Models
Citation type: mentioning
confidence: 99%
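
The conversion this citation statement describes, from a multi-branch training topology to a single path at inference, relies on convolution being linear: parallel branches that are summed can be folded into one kernel. The sketch below is a minimal illustration of that idea and not the implementation from either paper. It assumes a single-channel 1-D signal, three branches (a 3-tap convolution, a 1-tap convolution, and an identity shortcut), and it leaves out batch normalization, which RepVGG-style blocks fold into the kernels before merging. The names `conv1d_same` and `fuse_branches` are hypothetical helpers introduced here.

```python
import random


def conv1d_same(x, w, b=0.0):
    """Cross-correlate a 1-D signal with an odd-length kernel.

    The input is zero-padded so the output has the same length as x.
    """
    pad = len(w) // 2
    xp = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(w[i] * xp[t + i] for i in range(len(w))) + b
        for t in range(len(x))
    ]


def fuse_branches(w3, b3, w1, b1):
    """Fold a 3-tap branch, a 1-tap branch and an identity branch
    into one equivalent 3-tap kernel and bias (assumed branch set)."""
    fused_w = list(w3)
    # The 1-tap kernel and the identity both act on the centre tap only.
    fused_w[1] += w1[0] + 1.0
    return fused_w, b3 + b1


random.seed(0)
x = [random.gauss(0, 1) for _ in range(16)]
w3, b3 = [random.gauss(0, 1) for _ in range(3)], random.gauss(0, 1)
w1, b1 = [random.gauss(0, 1)], random.gauss(0, 1)

# Training-time topology: three parallel branches whose outputs are summed.
multi = [
    a + c + d
    for a, c, d in zip(conv1d_same(x, w3, b3), conv1d_same(x, w1, b1), x)
]

# Inference-time topology: a single convolution with the fused kernel.
fused_w, fused_b = fuse_branches(w3, b3, w1, b1)
single = conv1d_same(x, fused_w, fused_b)

# The two topologies should agree up to floating-point rounding.
max_abs_diff = max(abs(m - s) for m, s in zip(multi, single))
print(max_abs_diff)
```

Because the fused kernel has the same size as the largest branch, the single-path model performs one convolution per layer instead of three, which is where the inference-time saving comes from. Extending this to multi-channel layers replaces each scalar tap with a channel-mixing matrix, and the identity branch then requires equal input and output channel counts.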