2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP) 2015
DOI: 10.1109/mlsp.2015.7324385

Joint time-frequency scattering for audio classification

Abstract: We introduce the joint time-frequency scattering transform, a time-shift-invariant descriptor of time-frequency structure for audio classification. It is obtained by applying a two-dimensional wavelet transform in time and log-frequency to a time-frequency wavelet scalogram. We show that this descriptor successfully characterizes complex time-frequency phenomena such as time-varying filters and frequency-modulated excitations. State-of-the-art results are achieved for signal reconstruction and phone segment cla…
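
The construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Gaussian (Morlet-like) filters, the function names `gaussian_bank`, `scalogram` and `joint_scattering`, and all rate and bandwidth values are assumptions chosen for illustration, and a global time average stands in for the local low-pass averaging a full scattering transform would use.

```python
import numpy as np

def gaussian_bank(n, centers, q=4.0):
    """Analytic Gaussian band-pass filters defined in the Fourier domain.
    `centers` are normalized frequencies (cycles/sample); q sets the bandwidth.
    Assumption: a Morlet-like stand-in for the wavelets used in the paper."""
    freqs = np.fft.fftfreq(n)
    return np.array([np.exp(-0.5 * ((freqs - xi) / (xi / q)) ** 2)
                     for xi in centers])

def scalogram(x, n_bands=32, f_min=0.01, f_max=0.4):
    """Wavelet scalogram: modulus of band-pass outputs, with center
    frequencies spaced geometrically so the row axis is log-frequency."""
    centers = np.geomspace(f_max, f_min, n_bands)
    bank = gaussian_bank(len(x), centers)
    return np.abs(np.fft.ifft(bank * np.fft.fft(x), axis=-1))

def joint_scattering(x, time_rates=(0.005, 0.02), freq_rates=(0.1, 0.25)):
    """Filter the scalogram with separable 2-D band-pass filters along
    time and log-frequency, take the modulus, then average over time."""
    S = scalogram(x)
    n_bands, n = S.shape
    S_hat = np.fft.fft2(S)
    ft = np.fft.fftfreq(n)        # temporal modulation axis
    fl = np.fft.fftfreq(n_bands)  # log-frequency modulation axis
    feats = []
    for a in time_rates:
        g_t = np.exp(-0.5 * ((ft - a) / (a / 2)) ** 2)
        for b in freq_rates:
            for sign in (+1, -1):  # upward and downward orientations
                g_f = np.exp(-0.5 * ((fl - sign * b) / (b / 2)) ** 2)
                psi_hat = g_f[:, None] * g_t[None, :]
                U2 = np.abs(np.fft.ifft2(S_hat * psi_hat))
                # Global time average: invariant to circular time shifts.
                feats.append(U2.mean(axis=1))
    return np.stack(feats)  # shape: (n_paths, n_bands)

# Usage on a synthetic chirp (hypothetical test signal).
t = np.arange(2048)
chirp = np.cos(2 * np.pi * (0.02 * t + 2e-5 * t ** 2))
features = joint_scattering(chirp)
```

Because every step is a circular convolution followed by a modulus, circularly shifting the input only shifts the second-order coefficients in time, so their time average is unchanged. The two signs of the log-frequency filter separate rising from falling time-frequency structure, which a purely temporal filter could not distinguish.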

Cited by 84 publications (141 citation statements)
References 13 publications
“…We use the SPGL1 solver [127] with at most 200 iterations, and 2 := 0.01. The second system is MAPsCAT, which uses features computed with the scattering transform [3]. This produces 40 feature vectors of 469 dimensions for a 30-s excerpt.…”
Section: Methods (mentioning)
confidence: 99%
“…We use the Echo Nest Musical Fingerprinter (ENMFP) 3 to generate a fingerprint of every excerpt in GTZAN and to query the Echo Nest database having over 30,000,000 songs. The second column of Table 1 shows that this identifies only 60.6% of the excerpts.…”
Section: Identifying Excerpts (mentioning)
confidence: 99%
“…Since its introduction in [8], the scattering transform has found successful applications in, for example, audio genre, visual textures or medical data classification [3,11,12]. …”
Section: The Scattering Transform Of F Is (mentioning)
confidence: 99%
“…This is similar to MFCC coefficient computation, but a scattering-subband filterbank is used. The block diagram of the tanh based Scattered Transform Cepstral Coefficients (tanh-STCC) feature extraction algorithm is shown in Figure 2: The amplitude range of recorded sound data is normalized between -1 and 1 [9,13,14] before the filterbank. Pre-emphasis, framing, windowing, logarithm and the DCT block are the same as the ordinary MFCC computation.…”
Section: Tanh Based Scattered Transform Cepstral Coefficients (Ta… (mentioning)
confidence: 99%