2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014
DOI: 10.1109/icassp.2014.6854363

Deep neural networks for small footprint text-dependent speaker verification

Abstract: In this paper we investigate the use of deep neural networks (DNNs) for a small footprint text-dependent speaker verification task. At the development stage, a DNN is trained to classify speakers at the frame level. During speaker enrollment, the trained DNN is used to extract speaker-specific features from the last hidden layer. The average of these speaker features, or d-vector, is taken as the speaker model. At the evaluation stage, a d-vector is extracted for each utterance and compared to the enrolled speaker mode…
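The enrollment and evaluation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `toy_hidden_layer` is a hypothetical stand-in for the last hidden layer of a trained DNN, and the names `d_vector`, `cosine_similarity`, and `verify` are made up for this example. The abstract is cut off before it states how the d-vectors are compared, so cosine similarity with a fixed threshold is an assumption here.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def toy_hidden_layer(frame: Sequence[float]) -> Vector:
    # Hypothetical stand-in for the trained DNN's last hidden layer:
    # a fixed ReLU projection of a 2-dimensional frame to 3 activations.
    x, y = frame
    return [max(0.0, x), max(0.0, y), max(0.0, x + y)]


def d_vector(
    frames: Sequence[Sequence[float]],
    hidden_layer: Callable[[Sequence[float]], Vector],
) -> Vector:
    # Average the per-frame hidden-layer activations into one vector.
    activations = [hidden_layer(frame) for frame in frames]
    if not activations:
        raise ValueError("at least one frame is required")
    dim = len(activations[0])
    return [sum(a[i] for a in activations) / len(activations) for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumed comparison metric; the truncated abstract does not name one.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def verify(enrolled: Sequence[float], test: Sequence[float], threshold: float) -> bool:
    # Accept the claimed identity when the similarity clears the threshold.
    return cosine_similarity(enrolled, test) >= threshold


# Enrollment: average activations over the speaker's enrollment frames.
enrolled_model = d_vector([[1.0, 0.0], [1.0, 0.2]], toy_hidden_layer)

# Evaluation: extract a d-vector per test utterance and compare.
same_speaker = d_vector([[0.9, 0.1]], toy_hidden_layer)
impostor = d_vector([[0.0, 1.0]], toy_hidden_layer)

accepted = verify(enrolled_model, same_speaker, threshold=0.9)
rejected = not verify(enrolled_model, impostor, threshold=0.9)
```

In a real system the frames would be acoustic feature vectors and the threshold would be tuned on held-out data; the toy values above only show the data flow from frames to a decision.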

Citations: cited by 952 publications (620 citation statements)
References: 14 publications
“…[8][9][10] Meanwhile, it has deep network structure and nonlinear activation function, which makes all kinds of deep learning models be appropriate for big data model, especially for the ones with higher dimensions and which are nonlinear. However, the number of samples is relatively small in spectral analysis, and direct application of deep learning model may result in overfitting problem.…”
Section: Introduction
confidence: 99%
“…However, recent success of Deep Neural Networks in different areas of speech processing (Hinton et al, 2012; Lopez-Moreno et al, 2014) promise for the near future exciting developments in speaker recognition, as those advanced in Vasilakakis, Cumani, and Laface (2013), and Variani, Lei, McDermott, Lopez-Moreno, and Gonzalez-Dominguez (2014).…”
Section: Factor Analysis and I-vectors
confidence: 99%
“…Recently, many deep learning methods have been applied in the speech recognition and speaker verification systems [41, 165-167], and published results show that speech processing methods driven by MBD and deep learning can obviously improve the performance of the existing speech recognition and speaker verification system [40,168,169]. In the IoV systems, millions of sensors collect abundant vehicles and environmental noises from engines and streets will significantly reduce the accuracy of speech processing system, while the traditional speech enhancement methods, for example, Wiener filtering [170] and minimum mean-square error estimation (MMSE) [171] which focus on advancing signal noise ratio (SNR), do not take full advantage of a priori distribution of noises around vehicles.…”
Section: Speech Recognition and Verification For The Internet Of
confidence: 99%