2005
DOI: 10.1109/lsp.2005.855551

Statistical voice activity detection using a multiple observation likelihood ratio test

Abstract: Currently, there are technology barriers inhibiting speech processing systems that work in extremely noisy conditions from meeting the demands of modern applications. This letter presents a new voice activity detector (VAD) for improving speech detection robustness in noisy environments and the performance of speech recognition systems. The algorithm defines an optimum likelihood ratio test (LRT) involving multiple and independent observations. The so-defined decision rule reports significant improvements in s…

Cited by 153 publications (85 citation statements)
References: 12 publications

“…It is mainly for the detection efficiency. We apply the multiple observation technique [14], [15], [25] to the DFT and MFCC features. This technique has two advantages.…”
Section: Acoustic Features for VAD (mentioning)
confidence: 99%
“…It fuses the likelihood ratio tests of different statistical models in a linear weighted combination way with the weights optimized by the gradient descent algorithm. Yu and Hansen [14] inherited the advantages of the statistical-model-based multiple observation techniques [25], [27], and proposed to fuse the likelihood ratio tests of multiple observations by the discriminative weight training. Inspired by Kang's VAD and Yu's VAD [14], Suh and Kim [18] further proposed to conduct the linear weighted combination of multiple acoustic models and multiple observations together with all weights optimized by a generalized probabilistic descent algorithm.…”
Section: Introduction (mentioning)
confidence: 99%
“…An improvement over the LRT proposed by Sohn (Sohn et al, 1999) is the multiple observation LRT (MO-LRT) proposed by Ramírez (Ramírez et al, 2005b). The performance of the decision rule was improved by incorporating more observations to the statistical test.…”
Section: Multiple Observation Likelihood Ratio Test (mentioning)
confidence: 99%
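
The idea in the statement above, improving the decision rule by adding more observations to the test, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the per-bin log likelihood ratio assumes the Gaussian model commonly associated with Sohn's LRT, and the names `frame_log_lr`, `mo_lrt`, `m`, and `threshold` are hypothetical.

```python
import math


def frame_log_lr(gamma, xi):
    """Log likelihood ratio of one frame, averaged over frequency bins.

    gamma: a posteriori SNR per bin; xi: a priori SNR per bin.
    The per-bin expression assumes a Gaussian statistical model (an assumption
    of this sketch, not stated in the page above).
    """
    terms = [g * x / (1.0 + x) - math.log(1.0 + x) for g, x in zip(gamma, xi)]
    return sum(terms) / len(terms)


def mo_lrt(frame_llrs, m, threshold):
    """Multiple-observation decision: sum the log-LRs of frames l-m..l+m.

    The window is clipped at the start and end of the sequence.
    Returns one boolean per frame (True = speech).
    """
    n = len(frame_llrs)
    decisions = []
    for l in range(n):
        lo, hi = max(0, l - m), min(n, l + m + 1)
        decisions.append(sum(frame_llrs[lo:hi]) > threshold)
    return decisions


# Hypothetical per-frame log-LRs: one clearly speech-like frame among pauses.
llrs = [-1.0, -1.0, 3.0, -1.0, -1.0]
single = mo_lrt(llrs, m=0, threshold=0.0)   # single-observation test
multi = mo_lrt(llrs, m=1, threshold=0.0)    # three-frame window
```

With `m=0` the rule reduces to a per-frame test; a larger `m` lets neighbouring frames contribute evidence, at the cost of a delay of `m` frames before a decision is available.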
“…The non-speech hit rate (HR0) and the false alarm rate (FAR0= 100-HR1) were determined in each noise condition being the actual speech frames and actual speech pauses determined by handlabeling the database on the close-talking microphone. Figure 9 shows the ROC curves of the MO-LRT VAD (Ramírez et al, 2005b) and other frequently referred algorithms for recordings from the distant microphone in quiet and high noisy conditions. The working points of the G.729, AMR, and AFE VADs are also included.…”
Section: Receiver Operating Characteristics Curves (mentioning)
confidence: 99%
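
The two quantities named in the statement above can be computed from frame-level reference labels and VAD decisions along the following lines. This is a sketch under stated assumptions: the function name `hit_rates` and the boolean encoding (True = speech) are hypothetical, and both classes are assumed to be present in the labels.

```python
def hit_rates(labels, decisions):
    """Return (HR0, FAR0) in percent.

    labels, decisions: equal-length sequences of booleans, True = speech.
    HR0 is the share of non-speech frames classified as non-speech;
    FAR0 = 100 - HR1, where HR1 is the share of speech frames detected.
    Assumes at least one speech and one non-speech frame in labels.
    """
    n_speech = sum(1 for y in labels if y)
    n_pause = len(labels) - n_speech
    hr1 = 100.0 * sum(1 for y, d in zip(labels, decisions) if y and d) / n_speech
    hr0 = 100.0 * sum(1 for y, d in zip(labels, decisions) if not y and not d) / n_pause
    return hr0, 100.0 - hr1


# Hypothetical hand labels and detector output for five frames.
labels = [False, False, True, True, False]
decisions = [False, True, True, True, False]
hr0, far0 = hit_rates(labels, decisions)
```

Sweeping the decision threshold of a detector and recording one (FAR0, HR0) pair per threshold value would trace out one ROC curve of the kind the statement describes.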
“…The classification task is not as trivial as it appears, and most of the VAD algorithms often fail in high noise conditions. During the last decade, numerous researchers have developed different strategies for detecting speech in a noisy signal [6,7,8,9] and have evaluated the influence of the VAD effectiveness on the performance of speech processing systems [10,11,12,13].…”
Section: Introduction (mentioning)
confidence: 99%