2015
DOI: 10.1587/transinf.2015edp7138

Robust Voice Activity Detection Algorithm Based on Feature of Frequency Modulation of Harmonics and Its DSP Implementation

Abstract: This paper proposes a voice activity detection (VAD) algorithm based on an energy-related feature of the frequency modulation of harmonics. A multi-resolution spectro-temporal analysis framework, which was developed to extract texture features of the audio signal from its Fourier spectrogram, is used to extract frequency modulation features of the speech signal. The proposed algorithm labels the voice active segments of the speech signal by comparing the energy-related feature of the frequency modulati…

Cited by 22 publications (5 citation statements)
References 46 publications
“…When using this practice method, we must use our strengths and avoid weaknesses to maximize the advantages of group practice. Literature [7] not only divides the exercises into three types: "imitation memory, association creation, and task communication," but also points out the characteristics of these three exercises. Literature [8] divides classroom exercises into four categories: "understanding, imitating memory, intellectual development, and communicative."…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Traditional statistical models, widely used in speech enhancement due to their simplicity, often rely only on the nonstationarity of speech [5]-[7]. Better accuracy can be achieved by integrating models of features like pitch, harmonicity [8], [9], modulation [10], spectral shape [11], [12], etc. [13].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…For different audio types, different audio features are used. The literature used two acoustic characteristics of zero-crossing rate and short-term energy to classify voice and music in broadcast signals [13]. The literature first divided the audio signal in the TV into mute, signal with music component and signal without music component by using four audio characteristics of short-term energy, zero-crossing rate, pitch frequency, and spectral peak trajectory [14].…”
Section: Related Work (citation type: mentioning)
confidence: 99%
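
The last citation statement names two classic frame-level audio features, short-term energy and zero-crossing rate. The following is a minimal sketch of how these two features are commonly computed, not the method of the cited paper or of the citing works. The function names, frame length, hop size, and the synthetic two-tone test signal are all assumptions chosen for illustration.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into frames of frame_len, advancing by hop (trailing partial frame dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Hypothetical test signal (assumption): a loud 100 Hz tone followed by
# a quiet 3000 Hz tone, both sampled at 8 kHz.
SAMPLE_RATE = 8000
low = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(800)]
high = [0.1 * math.sin(2 * math.pi * 3000 * n / SAMPLE_RATE) for n in range(800)]

# Non-overlapping 25 ms frames (200 samples each); both values are illustrative.
frames = frame_signal(low + high, frame_len=200, hop=200)
features = [(short_term_energy(f), zero_crossing_rate(f)) for f in frames]

for energy, zcr in features:
    print(f"energy={energy:.4f}  zcr={zcr:.3f}")
```

In this sketch the frames taken from the loud low-frequency tone show higher energy and a lower zero-crossing rate than the frames taken from the quiet high-frequency tone, which is the kind of contrast a simple threshold-based classifier would exploit.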