2019
DOI: 10.1016/j.csl.2019.06.004

Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition

Abstract: In this work, we present an analysis of a DNN-based autoencoder for speech enhancement, dereverberation and denoising. The target application is a robust speaker verification (SV) system. We start our approach by carefully designing a data augmentation process to cover a wide range of acoustic conditions and obtain rich training data for various components of our SV system. We augment several well-known databases used in SV with artificially noised and reverberated data and we use them to train a denoising autoe…
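
The augmentation described in the abstract (artificially noised and reverberated data) can be illustrated with a minimal sketch. The abstract is truncated and does not give the SNR ranges, noise sources, or room impulse responses used, so the function names `add_noise_at_snr` and `reverberate`, the looping of short noise clips, and the truncation of the reverberant tail are all assumptions made here for illustration, not the authors' pipeline.

```python
import numpy as np


def add_noise_at_snr(speech, noise, snr_db):
    """Mix `noise` into `speech` at the requested SNR in dB (hypothetical helper)."""
    # Repeat or truncate the noise so it matches the speech length.
    noise = np.resize(noise, speech.shape)
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that speech_power / scaled_noise_power = 10^(snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise


def reverberate(speech, rir):
    """Convolve `speech` with a room impulse response (hypothetical helper)."""
    # Keep the output the same length as the input (assumption: the tail is dropped).
    return np.convolve(speech, rir)[: len(speech)]
```

A noisy and reverberant training example could then be produced with `add_noise_at_snr(reverberate(clean, rir), noise, snr_db)`, and the clean signal kept as the autoencoder target.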

Citations: cited by 43 publications (22 citation statements)
References: 36 publications
“…In this paper, DCF is the average of two minimum DCF scores when Ptarget, a priori probability of the specified target speaker, is 0.01 and 0.001. Performance comparison is done with DAE-based speech enhancement [8,10,9]. We use an 8-layer time-delay neural network (TDNN) [16] with 1000 hidden units per layer for enhancement.…”
Section: Methods; citation type: mentioning
confidence: 99%
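
The metric in the statement above (the average of two minimum DCF values at Ptarget = 0.01 and 0.001) can be sketched as follows. The statement does not give the cost parameters or the normalization, so the unit costs `c_miss = c_fa = 1` and the normalization by the best trivial system are assumptions borrowed from common NIST-style practice; the function names are hypothetical, and tied scores are not treated specially.

```python
import numpy as np


def min_dcf(target_scores, nontarget_scores, p_target, c_miss=1.0, c_fa=1.0):
    """Normalized minimum detection cost over all thresholds (sketch)."""
    scores = np.concatenate([target_scores, nontarget_scores])
    labels = np.concatenate(
        [np.ones(len(target_scores)), np.zeros(len(nontarget_scores))]
    )
    labels = labels[np.argsort(scores, kind="mergesort")]
    n_tar, n_non = len(target_scores), len(nontarget_scores)
    # Threshold index i rejects the i lowest-scoring trials and accepts the rest.
    misses = np.concatenate([[0.0], np.cumsum(labels)])
    false_alarms = n_non - np.concatenate([[0.0], np.cumsum(1.0 - labels)])
    p_miss = misses / n_tar
    p_fa = false_alarms / n_non
    dcf = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    # Normalize by the cost of always rejecting or always accepting.
    norm = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return float(dcf.min() / norm)


def average_min_dcf(target_scores, nontarget_scores, p_targets=(0.01, 0.001)):
    """Average the minimum DCF over the operating points named in the text."""
    return float(
        np.mean([min_dcf(target_scores, nontarget_scores, p) for p in p_targets])
    )
```

Here `target_scores` and `nontarget_scores` are arrays of verification scores for same-speaker and different-speaker trials; lower values of `average_min_dcf` indicate a better system.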
“…For these reasons, only a few studies have explored speech enhancement for speaker verification [7,6] and most recent studies are based on the i-vector approach [8,9,10]. They used a Denoising Autoencoder (DAE) to generate an enhanced signal from the noisy signal.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…The proposed LSTM with 20% to 30% relative improvement of EER, has given better results in comparison to other methods used in this research. In [6] a denoising autoencoder is used for joint compensation of additive noise and reverberation in the xvector framework. In this research, the DAE reconstructs the clean version of the magnitude spectrum.…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…Approaches based on speech signal enhancement [6] and denoising techniques at the speaker modeling level are often proposed to reduce the impact of noise and reverberation on speaker recognition systems [7].…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Denoising autoencoder is another DNN method used as a preprocessing step in speaker recognition. In [10] a deep feedforward autoencoder trained in order to transform the noisy log magnitude spectrum to a clean compact encoded signal in the same domain. In another paper, a DNN architecture that uses the feedback from the speaker verification (SV) system to generate a ratio mask was designed.…”
Section: Related Work; citation type: mentioning
confidence: 99%