2011
DOI: 10.1016/j.specom.2010.08.007

Discrimination of speech from nonspeech in broadcast news based on modulation frequency features

Abstract: In audio content analysis, the discrimination of speech and non-speech is the first processing step before speaker segmentation and recognition, or speech transcription. Speech/non-speech segmentation algorithms usually consist of a frame based scoring phase using MFCC features, combined with a smoothing phase. In this paper, a content based speech discrimination algorithm is designed to exploit long-term information inherent in modulation spectrum. In order to address the varying degrees of redundancy and dis…

Cited by 18 publications (12 citation statements)
References 23 publications
“…The HOSVD has been applied in numerous application domains [29], such as image processing [9,32,39,61], pattern recognition [49,50,59,60,62], data mining and machine learning [33,34,52,53], signal processing [12,19,36,37,38,45], psychometrics [54,55,56], chemometrics [5], and biomedicine [16,40,41]. Aside from its use in applications, the HOSVD is also of considerable theoretical importance.…”
mentioning
confidence: 99%
“…These characteristics are referred in the literature as segment-based features [29,30]. For example, in [31], a content-based speech discrimination algorithm is designed to exploit the long-term information inherent in the modulation spectrum; and in [32], authors propose two segment-based features: the variance of the spectrum flux (VSF) and the variance of the zero crossing rate (VZCR).…”
Section: General Description Of Audio Segmentation Systems
mentioning
confidence: 99%
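
The two segment-based features named in the statement above, the variance of the spectrum flux (VSF) and the variance of the zero crossing rate (VZCR), can be illustrated with a minimal sketch. This is not the implementation from the cited works: the frame length, the hop size, the Euclidean definition of flux, and all function names are assumptions chosen for illustration, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math
from statistics import pvariance


def frames(signal, frame_len, hop):
    """Split a sequence of samples into overlapping frames (incomplete tail dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), acceptable for a small demo)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectral_flux(prev_mag, mag):
    """Euclidean distance between consecutive magnitude spectra (assumed definition)."""
    return math.sqrt(sum((m - p) ** 2 for p, m in zip(prev_mag, mag)))


def segment_features(signal, frame_len=64, hop=32):
    """Return (VSF, VZCR) for one segment: variances of the per-frame values."""
    fr = frames(signal, frame_len, hop)
    zcrs = [zero_crossing_rate(f) for f in fr]
    mags = [magnitude_spectrum(f) for f in fr]
    fluxes = [spectral_flux(a, b) for a, b in zip(mags, mags[1:])]
    return pvariance(fluxes), pvariance(zcrs)
```

Calling `segment_features` on a segment of samples yields one (VSF, VZCR) pair for that segment. A stationary signal such as a steady tone gives values near zero for both, while a signal whose spectral content changes within the segment gives larger values, which is the kind of long-term variation such segment-level features are meant to capture.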
“…Some of the widely used are: (i) multi-class problem [10], (ii) binary-classes problem [37], (iii) hierarchical structure of the classes problem [11], (iv) two-groups or multi-group of classes problem [28] and (v) detection of a class over the other classes problem [19,48]. In this work, we present a broadcast news sound recognition methodology based on widely known and used audio features. The implemented framework clusters the audio feature space to subspaces, based on data-driven criteria.…”
Section: Introduction
mentioning
confidence: 99%