2011
DOI: 10.1109/lsp.2010.2100380

Spectrogram Image Feature for Sound Event Classification in Mismatched Conditions

Cited by 196 publications (119 citation statements)
References 5 publications
“…The performance results are shown in Table 2b. Some other works have also reported the classification performance on the RWCP dataset, but they used the subset of the dataset; e.g., the works of [5,6,17,18] reported around 90% only on 10 ∼ 20 sound categories, and Dennis et al [4] exhibited the performance of 98.1% which is close to our result though their method was evaluated only on 60 categories, half subset of ours. Therefore, we can say that the proposed method achieves the state-of-the-art performance on the whole RWCP dataset.…”
Section: Comparison To the Other Methods
type: supporting
confidence: 84%
“…While the methods for classifying speech and music have been intensively developed for decades, those for the environmental sounds are studied with keen attention in recent years [1,2,3,4]. The environmental sounds are different from the speech and music in that the acoustic signals are not stationary nor well-structured; characteristics in these types of sounds are discussed in [5].…”
Section: Introduction
type: mentioning
confidence: 99%
“…Scalograms are considered textures and 2D DWT have been used to analyze these textures in many applications [14] [15]. We follow [16] to decompose a scalogram into 2D DWT image.…”
Section: Scalogram Analysis
type: mentioning
confidence: 99%
“…Mel-Frequency Cepstral Coefficients (MFCCs), Zero Crossing Rate (ZCR), Spectral Centroid are some of the most widely used features for audio analysis. Many previous efforts have also been made in classifying audio signals using other features such as MPEG-7 descriptors [1,2], Linear Prediction coefficients [3], features derived from statistics of spectrogram image of an audio [4] and Log-Gabor Filters [5]. The bag of phrases approach is introduced in [6], where a codebook is generated using Gaussian Mixture Model and then the codebook is used to obtain a new set of features for the classification.…”
Section: Introduction
type: mentioning
confidence: 99%