Interspeech 2016
DOI: 10.21437/interspeech.2016-1197

A Speaker Recognition System for the SITW Challenge

Abstract: This paper presents an ITMO University system submitted to the Speakers in the Wild (SITW) Speaker Recognition Challenge. During the evaluation track of the SITW challenge we explored conventional universal background model (UBM) Gaussian mixture model (GMM) i-vector systems and recently developed DNN-posteriors based i-vector systems. The systems were investigated under the real-world media channel conditions represented in the challenge. This paper discusses practical issues of the robust i-vector systems traini…

Cited by 10 publications (6 citation statements)
References 8 publications
“…Note that when trained with augmented data, the DNN-based speaker embedding systems significantly outperform our previous i-vector-based systems on SITW protocol [37].…”
Section: Results (mentioning)
confidence: 81%
“…IDV and DICN approaches are applied on the out-of-domain i-vector set and they are indicated as IDV-SWB and DICN-SWB respectively. For comparison with prior Autoencoder-based method, we developed DAE using out-of-domain dataset which has speaker label with 1300 nodes of single hidden layer as Kudashev's approach [23] and examined as shown in system 8.…”
Section: Performance Comparison To State-of-the-art Techniques (mentioning)
confidence: 99%
“…Autoencoder based domain adaptation is widely used in machine learning community [17]-[19] and also adopted already on speech processing area [20]-[22]. Recently Kudashev [23] proposed a DAE-based denoising and domain adaptation for speaker recognition.…”
Section: Introduction (mentioning)
confidence: 99%
“…The performance of speaker verification degrades significantly when the speech is corrupted by interference speakers. Speaker diarization can be useful for speaker verification with nonoverlapping multi-talker speech [1][2][3][4][5][6]. It can effectively exclude unwanted speech segments when the speakers only slightly overlap [7,8].…”
Section: Introduction (mentioning)
confidence: 99%