2008
DOI: 10.1109/tmm.2008.922870

A Speech/Music Discriminator of Radio Recordings Based on Dynamic Programming and Bayesian Networks

Cited by 51 publications
(39 citation statements)
References 12 publications
“…A related 2015 MIREX task used data from the British Library's music collections for a comparative study on the problem of speech/music segmentation. 13 This particular binary task has formed a research problem for over a decade Pikrakis et al (2008). Leaving the two-class scenario, SeFiRe, a downloadable tool for the segmentation and visualization of field recordings, segments and labels an audio file into five classes: speech, solo and choir singing, instrumental, and bell chiming.…”
Section: Existing MIR Tools: A Case Study
Citation type: mentioning
confidence: 99%
“…The signal amplitude measured in root mean square (RMS) and zero-crossing (ZC) are used in real time implementation of SMD [3]. The dynamic programming and Bayesian networks with the entropy of the normalized spectral energy are applied for SMD of radio recording in [4]. In [1], thirteen different audio features were used to train different types of multidimensional classifiers, including a Gaussian maximum a posteriori (MAP) estimator and a nearest neighbor classifier.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Previous work in the area of audio classification has focused mostly in audio event classification [1] and speech/music discrimination [2]-[4]. In this letter we investigate the problem of automatically classifying collections of audio files in three acoustic classes: speech, instrumental music and song (music with singing voice).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Among the different kind of features proposed for speech/music discrimination, it is worth mentioning the well-known Mel-Frequency Cepstral Coefficients (MFCC) [4], [6], Line Spectral Frequencies (LSF) [2], Zero-Crossing Rate (ZCR) and Frame Energy [4], [6] and more specific parameters such as Spectral Centroid, Spectral Flux, Spectral Rolloff [6] or Chroma-Vector based features [4].…”
Section: Introduction
Citation type: mentioning
confidence: 99%