2013
DOI: 10.5815/ijigsp.2013.09.07

A New Design Approach for Speaker Recognition Using MFCC and VAD

Abstract: This paper presents a new approach to designing a speaker recognition system based on mel-frequency cepstral coefficients (MFCCs) and a voice activity detector (VAD). The VAD is used to suppress background noise and to distinguish silence from voice activity. MFCCs are extracted from the detected voice samples and compared with a database to recognize the speaker. A new detection criterion is proposed that gives very good performance in noisy environments.

Citations: cited by 10 publications (8 citation statements)
References: 16 publications
“…The Cepstrum is the inverse Fourier transform of the log spectrum. According to Figure 3, the implementation of MFCCs features can be classified into five sections [18,21] namely: 1) Preemphasis 2) Frame blocking and windowing 3) Fast Fourier Transform (FFT) 4) Mel-scaled filter bank and 5) Generate MFCCs features.…”
Section: Mel Frequency Cepstral Coefficients (MFCCs); citation type: mentioning
confidence: 99%
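
The five stages named in the citation statement above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited or citing papers: the function names, the 0.97 pre-emphasis coefficient, the 20-filter bank, the 12 output coefficients, and the 2595/700 Hz-to-mel mapping are all assumed here. Frame blocking is left to the caller, a naive DFT stands in for the FFT, and a DCT-II is used for the final step, which is the common practical substitute for the inverse Fourier transform of the log spectrum.

```python
import cmath
import math

def hz_to_mel(f):
    # Commonly used Hz-to-mel mapping (assumed; the source does not give its formula).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    # Naive DFT over bins 0..N/2; an FFT would be used in practice.
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec

def mel_filterbank(num_filters, n_fft, sample_rate):
    # Triangular filters whose edges are equally spaced on the mel scale.
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising edge
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12, pre_emph=0.97):
    n = len(frame)
    # 1) Pre-emphasis: y[t] = x[t] - a * x[t-1]
    emphasized = [frame[0]] + [frame[t] - pre_emph * frame[t - 1] for t in range(1, n)]
    # 2) Windowing (Hamming); frame blocking is assumed done by the caller
    windowed = [emphasized[t] * (0.54 - 0.46 * math.cos(2.0 * math.pi * t / (n - 1)))
                for t in range(n)]
    # 3) Spectrum
    spec = power_spectrum(windowed)
    # 4) Mel-scaled filter bank energies, then log (floored to avoid log(0))
    bank = mel_filterbank(num_filters, n, sample_rate)
    log_energy = [math.log(max(sum(w * p for w, p in zip(filt, spec)), 1e-10))
                  for filt in bank]
    # 5) DCT-II of the log energies gives the cepstral coefficients (c0 skipped)
    return [sum(log_energy[j] * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j in range(num_filters))
            for c in range(1, num_coeffs + 1)]
```

For example, `mfcc(frame, 8000)` on a 128-sample frame (16 ms at an assumed 8 kHz sampling rate) returns a list of 12 coefficients.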
“…In the final section, the log for Mel-spectrum is used and transferred back to the time domain to produce the MFCCs features [22]. The reader can refer to [18,21] for further information. The bandwidth and spacing are calculated by a constant interval of Mel-frequency [21] as shown in (1): Figure 2.…”
Section: Mel Frequency Cepstral Coefficients (MFCCs); citation type: mentioning
confidence: 99%
“…In auditory cortex, receptive fields are defined as cortical circuits involving small clusters of neurons ordered topographically according to the tuning characteristics of cochlea and the clustered neurons as well as cortical circuit becoming active with specific acoustic features. But, the acoustical features of vowels consist not only specific linguistic information but also perturbed acoustical features which are generated during vowel production due to the differences of vocal tract size, shape and physical conditions of speakers [5,6]. Moreover, the environmental noises also contaminate the acoustical features of vowels.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…A pre-emphasis finite impulse response (FIR) filter realizing a first order highpass filter was employed to filter the speech samples with emphasis coefficient 0.96 [5]. In addition, framing and Hamming windowing were employed with a frame length of 16 ms with an inter-frame overlap of 8 ms [33]. Moreover, this work exploits a triangular/Mel filter bank (MFB) and the logarithmic non-linearity used in MFCC [34], as well as the Gammatone filter bank (GFB) and power law non-linearity for PNCC [31,35,36].…”
Section: Feature Extraction and Compensation; citation type: mentioning
confidence: 99%
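
The front end described in the citation statement above (first-order high-pass pre-emphasis with coefficient 0.96, then 16 ms Hamming-windowed frames with 8 ms overlap) might look roughly like the sketch below. The function names and the 8 kHz sampling rate used in the example are assumptions for illustration; the statement does not give a sampling rate or any code.

```python
import math

def pre_emphasize(samples, coeff=0.96):
    # First-order high-pass FIR filter: y[n] = x[n] - coeff * x[n-1]
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - coeff * samples[n - 1])
    return out

def frame_and_window(samples, sample_rate, frame_ms=16.0, overlap_ms=8.0):
    # Split the signal into overlapping frames and apply a Hamming window to each.
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = frame_len - int(round(sample_rate * overlap_ms / 1000.0))
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    # Trailing samples that do not fill a whole frame are dropped in this sketch.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append([samples[start + n] * window[n] for n in range(frame_len)])
    return frames

# Example with an assumed 8 kHz sampling rate: 128-sample frames, 64-sample hop.
signal = [math.sin(2.0 * math.pi * 200.0 * t / 8000.0) for t in range(8000)]
frames = frame_and_window(pre_emphasize(signal), 8000)
```

With these assumed settings, each frame holds 128 samples and consecutive frames start 64 samples apart.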