2008
DOI: 10.1109/icassp.2008.4517545

Cross-correlation of beat-synchronous representations for music similarity

Abstract: Systems to predict human judgments of music similarity directly from the audio have generally been based on the global statistics of spectral feature vectors, i.e., collapsing any large-scale temporal structure in the data. Based on our work in identifying alternative ("cover") versions of pieces, we investigate using direct correlation of beat-synchronous representations of music audio to find segments that are similar not only in feature statistics, but in the relative positioning of those features in tempo-no…
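As a rough illustration of what "direct correlation of beat-synchronous representations" could look like, the sketch below cross-correlates two feature matrices with one column per beat. It is not the authors' implementation: the abstract is truncated, so the feature type (12-bin chroma-like vectors here), the per-beat unit normalization, the division by the shorter length, and the names `normalize_columns` and `beat_cross_correlation` are all assumptions made for the example.

```python
import numpy as np


def normalize_columns(feats):
    """Scale each beat (column) to unit Euclidean norm; all-zero beats stay zero."""
    norms = np.linalg.norm(feats, axis=0, keepdims=True)
    return feats / np.where(norms > 0, norms, 1.0)


def beat_cross_correlation(a, b):
    """Cross-correlate two (n_bins, n_beats) matrices along the beat axis.

    Returns (lags, scores). scores[i] is the sum of dot products between
    beat n + lags[i] of `a` and beat n of `b` over all overlapping beats,
    divided by the length of the shorter matrix (an assumed normalization).
    """
    a = normalize_columns(np.asarray(a, dtype=float))
    b = normalize_columns(np.asarray(b, dtype=float))
    n_a, n_b = a.shape[1], b.shape[1]
    # np.correlate in "full" mode covers lags -(n_b - 1) .. (n_a - 1).
    lags = np.arange(-(n_b - 1), n_a)
    scores = np.zeros(len(lags))
    for row_a, row_b in zip(a, b):  # accumulate one feature bin at a time
        scores += np.correlate(row_a, row_b, mode="full")
    return lags, scores / min(n_a, n_b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.random((12, 20))   # hypothetical 20-beat excerpt
    track = rng.random((12, 50))   # hypothetical 50-beat track
    track[:, 7:27] = query         # plant the excerpt at beat 7
    lags, scores = beat_cross_correlation(track, query)
    print("best lag (beats):", lags[np.argmax(scores)])
```

Because the columns are indexed by beat rather than by time, a lag here is a shift in beats, so two renditions at different tempos can still line up. Unlike global feature statistics, the score depends on the order in which the features occur.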

Cited by 18 publications (25 citation statements)
References: 6 publications
“…These notes can be unique for each time slot (a melody) or can be played jointly with others (chord or harmonic progressions). From a MIR point of view, clear evidence about the importance of tonal sequences for music similarity and retrieval exists [9,22,34]. In fact, almost all cover song identification algorithms exploit tonal sequence representations extracted from the raw audio signals: they either estimate the main melody, the chord sequence, or the harmonic progression.…”
Section: Feature Extraction (mentioning)
confidence: 99%
“…Within the MSP frameworks of a music matching system [54] and an MPEG-7 metadata calculation from audio streams, it was demonstrated that this leads to virtually no effect on the applications' precision [6]. Similarly, other experiments [7] show that the peak performance of the generic matrix multiplication (GEMM) routine of BLAS achieved on a multicore processor can be increased by up to 92% in comparison to the state-of-the-art double-precision GEMM routine of the Goto library [68].…”
Section: Summary Of Results (mentioning)
confidence: 99%
“…Application examples utilizing such kernels are: document clustering [53], multimedia retrieval engines (such as images/video/music/forensic-indices/metadata-based retrieval [54]), webpage ranking systems [13], etc. Given the prevalence of such systems, significant emphasis has been placed on their efficient parallelization of their computationally-intensive elements (kernels) in computer clusters or GPUs [55].…”
Section: B Information Indexing and Multimedia Retrieval (mentioning)
confidence: 99%
“…and aims for minimum mean-squared error (MSE) or maximum learning, recognition or matching rate against ground-truth or training data, rather than performance bounds for individual inputs. Examples of such error-tolerant (ET) systems include: lossy image/video/audio compression [2], [3], computer graphics [4], [5], webpage indexing and retrieval [6], object and face recognition in video [7], [8], image/video/music matching [9]-[12], etc. For instance, all face recognition and webpage ranking algorithms optimize for the expected recall percentage against ground-truth results and not for the worst-case.…”
Section: Introduction (mentioning)
confidence: 99%