2019 9th International Conference on Advances in Computing and Communication (ICACC) 2019
DOI: 10.1109/icacc48162.2019.8986182

Performance Comparison of Multiple Speech Features for Speaker Recognition using Artifical Neural Network

Cited by 7 publications
(8 citation statements)
References 11 publications
“…The modeling and decision making in speaker recognition also vary, starting from the widely used HMM [4,16], vector quantization (VQ) [9], SVM [14], the classical technique Gaussian Mixture Model (GMM) [10,12,13], and ANN [17,18]. However, from the decision-making model, HMM is more suitable for modeling speaker features, especially in text-dependent speaker recognition systems, where there are two types of HMM, Ergodic and left-right [19].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In recent years, with the progress of computer technology, machine learning technology has also been rapidly developed. Neural networks as the main approach of machine learning techniques have been extensively studied and applied [17][18][19]. These machine learning methods perform well in dealing with optimisation problems.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Related to this field is the blind source separation or the cocktail party effect wherein a specific speaker's voice can be extracted from a group of noisy or unwanted environment [10]. Neural networks can also be trained to model speaker features as what was done in [11] and other applications [11].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In speaker modeling, background speaker and universal background modeling methods can be utilized. In this work, we adapt the constrained maximum likelihood linear regression (CMLLR) [11] and support vector machines (SVM) methods for recognizing a person's voice. Gaussian mixture models (GMM) models the speaker and its background.…”
Section: Introduction (citation type: mentioning)
confidence: 99%