Enhanced voice activity detection using acoustic event detection and classification
2011
DOI: 10.1109/tce.2011.5735502

Cited by 36 publications (3 citation statements)
References 13 publications

“…For instance, images containing customers' facial expressions require the whole face unit to be detected and segmented. On the other hand, time-ranged signals need a Voice Activity Detector (VAD) to identify the segments that contain human voice (Cho, 2011).…”
Section: Automated Emotion Recognition System (mentioning)
Confidence: 99%
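
For context, here is a minimal sketch of the frame-energy thresholding idea behind a basic VAD front end, assuming a mono signal held in a NumPy array. The frame length, hop, and threshold values are illustrative defaults, not parameters from Cho (2011).

```python
import numpy as np

def energy_vad(signal, sample_rate, frame_ms=25, hop_ms=10, threshold_db=-35.0):
    """Flag frames whose short-time energy exceeds a fixed threshold.

    Returns one boolean per frame (True = voice-active). frame_ms,
    hop_ms, and threshold_db are illustrative defaults, not values
    taken from Cho (2011).
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop_len)

    # Short-time energy: mean squared amplitude per frame.
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop_len : i * hop_len + frame_len]
        energies[i] = np.mean(frame ** 2)

    # Express each frame in dB relative to the loudest frame, then threshold.
    energies_db = 10.0 * np.log10(energies / (energies.max() + 1e-12) + 1e-12)
    return energies_db > threshold_db
```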

“…Similar work by Namgook and Kim [22] developed voice detection and classification modules, integrating a total of 202 minutes of audio containing wood, metal, door-slam, and speech sounds via Gaussian mixture models (GMM) and support vector machines (SVM). GMM performed well, giving an accuracy of 98.72% for the "door-slamming + speech" activity, compared to 88.3% for the closest presented case of "Door-closing DC + Traffic TF" in the proposed methodology.…”
Section: Benchmarking With Previous Work (mentioning)
Confidence: 99%
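
A minimal sketch of the one-GMM-per-class scoring scheme this excerpt describes, using scikit-learn. Frame-level feature extraction (e.g., MFCCs) is assumed to happen upstream, and the component count and covariance type are illustrative assumptions, not the settings from the cited work.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_class_gmms(features_by_class, n_components=8):
    """Fit one GMM per acoustic event class (e.g., door slam, speech).

    features_by_class maps a class label to an (n_frames, n_dims) array
    of frame-level features. n_components=8 and the diagonal covariance
    are illustrative choices, not values from the cited work.
    """
    return {
        label: GaussianMixture(n_components=n_components,
                               covariance_type="diag").fit(feats)
        for label, feats in features_by_class.items()
    }

def classify_segment(gmms, segment_features):
    """Assign the class whose GMM gives the highest average log-likelihood."""
    scores = {label: gmm.score(segment_features)
              for label, gmm in gmms.items()}
    return max(scores, key=scores.get)
```

Taking the per-class likelihood argmax is the standard GMM classifier; an SVM variant, as also mentioned in the excerpt, would instead train a single discriminative model on pooled labelled features.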

“…Block diagram of the classification of non-silent activity segments via an autoregressive neural classifier (NARX) with directional audio as exogenous data. Summation-of-energy features are actively utilized in sound synthesis research, as reported by Portelo et al. [22].…”
Citation type: mentioning
Confidence: 99%
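
A minimal sketch of how the two ingredients in this excerpt could fit together, assuming frame-level activity labels and a directional audio channel as NumPy arrays: a summation-of-energy feature per frame, and NARX-style training pairs that concatenate lagged past outputs (the autoregressive part) with lagged exogenous energies. The function names, frame sizes, and tap count are hypothetical, not taken from the cited work.

```python
import numpy as np

def frame_energy(signal, frame_len=400, hop_len=160):
    """Summation-of-energy feature: sum of squared samples per frame.

    The defaults correspond to 25 ms frames with a 10 ms hop at 16 kHz;
    both are illustrative, not values from the cited work.
    """
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop_len)
    return np.array([
        np.sum(signal[i * hop_len : i * hop_len + frame_len] ** 2)
        for i in range(n_frames)
    ])

def narx_dataset(labels, exog_energy, lag=3):
    """Assemble NARX training pairs for any downstream classifier.

    Each input concatenates the `lag` previous activity labels
    (autoregressive feedback) with the `lag` previous exogenous energy
    values from the directional audio channel; the target is the
    current label. lag=3 is an illustrative tap count.
    """
    X, y = [], []
    for t in range(lag, len(labels)):
        X.append(np.concatenate([labels[t - lag:t], exog_energy[t - lag:t]]))
        y.append(labels[t])
    return np.array(X), np.array(y)
```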