2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012
DOI: 10.1109/icassp.2012.6287926

Spectrogram based features selection using multiple kernel learning for speech/music discrimination

Abstract: This paper presents a multiple kernel learning (MKL) approach to speech/music discrimination (SMD). The time-frequency representation (spectrogram) implemented by short-time Fourier transform (STFT) of an audio segment is decomposed by wavelet packet transform into different subband levels. The subbands, which contain rich texture information, are used as features for this discrimination problem. The MKL technique is used to select the optimal subbands to discriminate the audio signals. The proposed MKL based algorith…
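The front end described in the abstract (a spectrogram split into wavelet packet subbands) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Haar filter, the frame length and hop size, the log compression, the two-level depth, and the mean-energy feature per subband are all assumptions chosen for brevity, and the MKL subband-selection step is left out entirely.

```python
import numpy as np

def spectrogram(x, frame_len=256, hop=128):
    """Squared-magnitude STFT with a Hann window; rows are frequency bins."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

def haar_split_2d(img):
    """One 2D Haar step: returns four half-size subbands (LL, LH, HL, HH)."""
    # Crop to even dimensions so rows and columns pair up (an assumption).
    img = img[:img.shape[0] // 2 * 2, :img.shape[1] // 2 * 2]
    lo_rows = (img[0::2, :] + img[1::2, :]) / np.sqrt(2)
    hi_rows = (img[0::2, :] - img[1::2, :]) / np.sqrt(2)

    def split_cols(a):
        return ((a[:, 0::2] + a[:, 1::2]) / np.sqrt(2),
                (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2))

    ll, lh = split_cols(lo_rows)
    hl, hh = split_cols(hi_rows)
    return [ll, lh, hl, hh]

def wavelet_packet_2d(img, levels=2):
    """Full packet tree: every subband is split again at each level."""
    bands = [img]
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_split_2d(band)]
    return bands

# Synthetic noise stands in for an audio segment (hypothetical input).
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

spec = np.log1p(spectrogram(signal))          # log compression is an assumption
subbands = wavelet_packet_2d(spec, levels=2)  # 4**2 = 16 subbands
features = np.array([np.mean(b ** 2) for b in subbands])  # one energy value per subband
```

Unlike a plain wavelet transform, which only re-splits the low-pass band, the packet tree re-splits every band, which is what yields a set of candidate subbands for a selection method such as MKL to weight.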

Cited by 11 publications (2 citation statements)
References 8 publications
“…Where ᴪ is the STFT operator, X(τ,k) is the STFT of the signal x(n), w(n) is the window and S(τ,k) is the spectrogram [13], [14].…”
Section: Reverberation Phenomena (mentioning)
confidence: 99%
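The equation that the quoted sentence annotates is not reproduced in the excerpt. A common textbook form consistent with the symbols it names is given below as an assumption, not as the citing paper's exact expression; the DFT length N is a symbol introduced here for illustration.

```latex
X(\tau, k) = \Psi\{x(n)\} = \sum_{n=-\infty}^{\infty} x(n)\, w(n - \tau)\, e^{-j 2 \pi k n / N},
\qquad
S(\tau, k) = \left| X(\tau, k) \right|^{2}
```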
“…Nilufar et al [54] use wavelet packet decomposition [55], an extension of wavelet transform that includes more signal filters, for robust speech and music discrimination. This technique is applied to the spectrogram to transform it into different subbands containing texture information.…”
Section: Other Time-frequency Representations (mentioning)
confidence: 99%