2012
DOI: 10.1186/1687-4722-2012-1

A novel voice activity detection based on phoneme recognition using statistical model

Abstract: In this article, a novel voice activity detection (VAD) approach based on phoneme recognition using a Gaussian mixture model based hidden Markov model (HMM/GMM) is proposed. Several speech features, namely high order statistics (HOS), harmonic structure information and Mel-frequency cepstral coefficients (MFCCs), are employed to represent each speech/non-speech segment. The main idea of this new method is to regard non-speech as a new phoneme corresponding to the conventional phonemes in Mandarin, …

Cited by 14 publications (5 citation statements)
References: 17 publications
“…Recently, many advanced techniques were proposed to eliminate the limitation of MFCC. The harmonic structure-related information is robust to high-pitched sounds and seems to be a promising cue to increase robustness of the noise, especially in low SNR conditions [13]. Fukuda et al. [8] replaced the traditional Mel-frequency cepstral coefficients by the harmonic structure information and made a significant improvement of recognition rate.…”
Section: Enhanced Voice Activity Detection Algorithm
Citation type: mentioning
confidence: 99%
“…Wu and Zhang [12] proposed a multiple kernel support vector machine (MK-SVM) method for multiple feature based VAD. Bao and Zhu [13] combined harmonic structure information and the high order statistics (HOS) with Gaussian mixture model based hidden Markov model (HMM/GMM) for efficient speech/nonspeech classification.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Therefore, choosing a suitable data fitting model is of great significance to the subsequent diagnosis results. The common method of combining data fitting with HMM is Gaussian mixture model (GMM) [17], which is commonly used in the field of speech recognition [22][23][24][25]. The observation probability matrix in the HMM parameters can be described by the GMM of the observation sequence, and the corresponding recognition model can be obtained.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
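
The statement above describes using a GMM to model the observation probabilities of an HMM. As an illustration only, the following minimal Python sketch shows one way that idea could look. It is not taken from the cited papers: the state names, the mixture parameters, and the function names (`gmm_loglik`, `emission_logprobs`) are hypothetical toy choices, and diagonal covariances are assumed.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(x, weights, means, variances):
    """Log of sum_k w_k * N(x; mean_k, var_k), computed with log-sum-exp."""
    terms = [
        math.log(w) + diag_gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

# Hypothetical two-state model: each HMM state owns one GMM (toy numbers).
# Each entry is (mixture weights, component means, component variances).
states = {
    "speech": ([0.6, 0.4], [[2.0, 1.0], [3.0, 0.5]], [[1.0, 1.0], [0.5, 0.5]]),
    "non_speech": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
}

def emission_logprobs(x):
    """Observation log-probability of frame x under every state's GMM."""
    return {name: gmm_loglik(x, *params) for name, params in states.items()}
```

Calling `emission_logprobs` on each feature frame yields the per-state observation scores that a decoder such as Viterbi would combine with the transition probabilities. In practice the mixture parameters would be estimated from training data rather than written by hand.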
“…However, the algorithms based on speech features with heuristic rules have difficulty in coping with real world noises at low SNR conditions. Recently, statistical model based VAD is found to be an efficient approach to segregate speech and non-speech frames under a broad range of background noises [11], [12], [13], [14], [15], [16]. In [11], a robust VAD algorithm based on statistical likelihood ratio test (LRT) involving a single observation vector is proposed.…”
Section: Introduction
Citation type: mentioning
confidence: 99%