2011 4th International Conference on Mechatronics (ICOM) 2011
DOI: 10.1109/icom.2011.5937134

Evaluating the effect of voice activity detection in isolated Yoruba word recognition system

Abstract: This paper discusses and evaluates the effect of Voice Activity Detection (VAD) in an isolated Yoruba word recognition system (IYWRS). The word database used in this paper was collected from 22 speakers, each repeating the numbers 1 to 9 three times. A hybrid configuration of Mel-Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) was used to extract the features of the speech samples. Artificial Neural Network algorithms are then used to classify these features. An overall accurac…
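The abstract does not say which VAD algorithm the paper uses, so the following is only a minimal sketch of one common approach, short-time energy thresholding, to illustrate what a VAD stage does before feature extraction. The function names, frame sizes, and the `ratio` threshold are illustrative assumptions, not values taken from the paper.

```python
import math

def frame_energies(signal, frame_len, hop):
    """Short-time energy (mean squared amplitude) of each frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def detect_speech(signal, frame_len=200, hop=100, ratio=0.1):
    """Return (start, end) sample indices of the active region, or None.

    A frame counts as active when its energy exceeds `ratio` times the
    largest frame energy. `ratio` is a hypothetical threshold chosen
    for illustration only.
    """
    energies = frame_energies(signal, frame_len, hop)
    if not energies:
        return None
    peak = max(energies)
    if peak == 0:
        return None
    threshold = ratio * peak
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    # Span from the first active frame to the end of the last active frame.
    return active[0] * hop, active[-1] * hop + frame_len

# Synthetic test signal: silence, a 440 Hz tone burst, silence (8 kHz sampling).
rate = 8000
silence = [0.0] * 2000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(4000)]
signal = silence + tone + silence

region = detect_speech(signal)
trimmed = signal[region[0]:region[1]]  # samples passed on to feature extraction
```

In a recognition pipeline like the one the abstract describes, only the trimmed region would be passed to the MFCC/LPC feature extractor, so leading and trailing silence does not dilute the features.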

Citations: cited by 7 publications (9 citation statements)
References: 12 publications

“…Based on the weaknesses and strengths of the two methods, both feature extraction using MFCC and or LPC, the researchers prefer feature extraction using MFCC because the level of accuracy is better than LPC [1][7] [8]. MFCC feature extraction was between 58-75% [7]. In addition, the LPC method, research by [9] is more suitable for linear computations, whereas the human voice is essentially non-linear.…”
Section: A Feature Extraction Methods (mentioning)
confidence: 99%
“…According to [22], NN has a weakness in the training process, which requires a long time with a large amount of data. The same statement by [7] to identify the number one to nine utterances has a problem when the training process with massive data requires a very long processing time.…”
Section: B Matching Speech Recognition (mentioning)
confidence: 99%
“…So far, feature extraction of the most widely used speech signals is Mel-Frequency Ceptrum Coefficients (MFCC) in both speaker recognition and speech recognition by Davis and Mermelstein [8]. Feature extraction with MFCC according to Aibinu et al [9] has a level of accuracy of 58%. The recognition accuracy 75 % for MFCC by Hidayat [10].…”
Section: Literature Review (mentioning)
confidence: 99%
“…That of [20] reveals that while ANN performs better with training data and FL is better with test data with ANN models performs better than FL on the overall. Reference [26] experimented on the effect of Voice Activity Detection (VAD) on SY ASR. Hybrid of MFCC and Linear Predictive Coding (LPC) was used for feature extraction while ANN was used in recognition stage.…”
Section: Research Progress In Sy Asr (mentioning)
confidence: 99%