2020
DOI: 10.11591/ijeecs.v18.i2.pp782-789

Comparison of feature extraction and normalization methods for speaker recognition using grid-audiovisual database

Abstract: In this paper, different feature extraction and feature normalization methods are investigated for speaker recognition. With a view to give a good representation of acoustic speech signals, Power Normalized Cepstral Coefficients (PNCCs) and Mel Frequency Cepstral Coefficients (MFCCs) are employed for feature extraction. Then, to mitigate the effect of linear channel, Cepstral Mean-Variance Normalization (CMVN) and feature warping are utilized. The current paper investigates Te…
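The abstract's remark that CMVN mitigates linear channel effects can be illustrated with a small sketch. A linear channel acts as a convolution in time, which becomes an approximately constant additive offset in the cepstral domain, so subtracting the per-utterance mean removes it. The code below is a minimal per-utterance CMVN in plain Python, not the paper's implementation; the function name `cmvn`, the `eps` guard, and the toy feature values are all assumptions made for illustration.

```python
import math

def cmvn(frames, eps=1e-10):
    """Per-utterance cepstral mean-variance normalization (illustrative sketch).

    frames: list of equal-length feature vectors, one per frame.
    Returns a new list in which each coefficient dimension has zero mean
    and approximately unit variance across the utterance.
    """
    n_frames = len(frames)
    n_dims = len(frames[0])
    # Mean of each coefficient over all frames.
    means = [sum(f[d] for f in frames) / n_frames for d in range(n_dims)]
    # Population standard deviation of each coefficient.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n_frames)
        for d in range(n_dims)
    ]
    # eps (an assumed guard) avoids division by zero for constant dimensions.
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(n_dims)]
        for f in frames
    ]

# Toy "cepstra": 4 frames x 2 coefficients (made-up values).
clean = [[1.0, -2.0], [2.0, 0.0], [3.0, 2.0], [4.0, 4.0]]
# A fixed additive bias in the cepstral domain, standing in for a linear channel.
offset = [5.0, -3.0]
shifted = [[c + o for c, o in zip(frame, offset)] for frame in clean]

norm_clean = cmvn(clean)
norm_shifted = cmvn(shifted)
```

In this toy case `norm_clean` and `norm_shifted` agree up to floating-point error, which is the sense in which the normalization cancels a constant channel offset. Feature warping, the other method named in the abstract, instead maps each coefficient's distribution over a sliding window to a target distribution and is not shown here.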

Cited by 8 publications (8 citation statements)
References 21 publications
“…Speaker recognition attempts to figure out which speaker produced a discourse signal though speaker check affirms if the part of the discourse has a place with the person who allegation it. It ought to be noticed that there are two sorts of speaker recognition, which are; text independent and text dependent [14]. This paper will anyway concentrate on text dependent speaker recognition.…”
Section: Introduction
mentioning
confidence: 99%
“…In this case, various optical, electronic and radio engineering systems with program control are used. The most famous systems of machine and computer vision with digital image processing [1]- [3]. However, in these papers are set forth general technologies for recognizing objects without solving the problem of sorting minerals by any signs.…”
Section: Introduction
mentioning
confidence: 99%
“…The mel frequency cepstral coefficients (MFCC) method is a reliable method with high accuracy for high-quality audio recordings [13]. The accuracy rate is more than 90% [14], [15]. The high accuracy of the MFCC method is due to the mel scale which has characteristics similar to human hearing [11].…”
Section: Introduction
mentioning
confidence: 99%
“…The characteristics of the mel scale on channels 1 and 2 follow the principle that not all follow a linear pattern, so that each channel follows a linear and exponential pattern. The frequency is then on the mel scale with (15). The width of the mel scale plane of channel 1 and channel 2 is shown in Figure 5.…”
mentioning
confidence: 99%