2019
DOI: 10.1109/access.2019.2901812

Noise Robust Speaker Recognition Based on Adaptive Frame Weighting in GMM for i-Vector Extraction

Abstract: Even though speaker recognition has gained significant progress in recent years, its performance is known to be deteriorated severely with the existence of strong background noises. Inspired by a recently proposed clean-frame selection approach, this work investigates a relatively elegant weighting method when computing the Baum-Welch statistics of Gaussian mixture models (GMMs) in i-vector extraction. By introducing weighting parameters to the frames of enrollment/testing utterances, the optimization problem …

Cited by 23 publications
(9 citation statements)
References 29 publications
“…The noise reduction method in this study uses adaptive noise-canceling (ANC) with the least mean square (LMS) algorithm [18], [19]. This method has a simple and reliable structure [20], [21]. The structure of the LMS algorithm is shown in Figure 2(a).…”
Section: Adaptive Noise-canceling (citation type: mentioning)
confidence: 99%
“…From the perspective of the speech recognition model, the application of speech signal can be roughly divided into three categories, including vocal print recognition, speech recognition, and emotion recognition [8]. The classifiers for speech recognition tasks include traditional classifiers and deep learning algorithms, involving HMM, Gaussian Mixture Model (GMM), support vector machine (SVM), and extreme learning machine (ELM) [9][10][11]. At present, the role of acoustic parameters is analyzed in the objective evaluation of artistic vocal, and methods of it are proposed based on error back propagation (BP) and learning vector quantization (LVQ) [12].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Even though TI-SV is more challenging than TD-SV because of the phonetic variability, TI-SV is more convenient from a user point of view in that the user can speak freely to the system. Over the past decades, the i-vector approach [2] with probabilistic linear discriminant analysis (PLDA) [3] has been widely used for TI-SV [4]-[7]. The i-vector approach learns a low-dimensional representation containing both speaker and channel variability, through which a variable-length utterance can be represented as a fixed-dimensional i-vector.…”
Section: Introduction (citation type: mentioning)
confidence: 99%