2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010
DOI: 10.1109/icassp.2010.5495219

Cyclic tempogram—A mid-level tempo representation for music signals

Abstract: The extraction of local tempo and beat information from audio recordings constitutes a challenging task, particularly for music that reveals significant tempo variations. Furthermore, the existence of various pulse levels such as measure, tactus, and tatum often makes the determination of absolute tempo problematic. In this paper, we present a robust mid-level representation that encodes local tempo information. Similar to the well-known concept of cyclic chroma features, where pitches differing by octaves are…

Cited by 62 publications (58 citation statements)
References 10 publications
“…The peaks of the resulting beat spectrum B represent the strength of BPM values in the signal [18]. But they do not take into account harmonics, i.e., the fact that a 30 BPM peak usually implies a 60 BPM peak [19,20]. Therefore we derive an enhanced beat spectrum BE, which boosts frequencies that are supported by certain harmonics:…”
Section: Estimating the Dominant Pulse (mentioning)
confidence: 98%
“…One of the issues facing tempo induction is that algorithms frequently make octave errors, i.e. they estimate the tempo to be either half or twice (a third or thrice) the actual ground truth tempo [13,14]. It is frequently the case that this ambiguity also exists with human tapping; selecting the right tempo between two integer-related candidates is challenging.…”
Section: Introduction (mentioning)
confidence: 99%
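
The octave errors described in the statement above can be illustrated with a small sketch that labels how a tempo estimate relates to a reference tempo. The function name `tempo_relation`, the label names, and the 4% relative tolerance are assumptions made for illustration; none of them come from the cited papers.

```python
def tempo_relation(estimate_bpm, reference_bpm, tolerance=0.04):
    """Label an estimate as exact, an integer-ratio (octave) error, or other.

    The tolerance is a relative deviation from each candidate tempo and is
    an assumed value, not one taken from the cited work.
    """
    # Candidate ratios named in the statement: half, twice, a third, thrice.
    factors = {
        "exact": 1.0,
        "double": 2.0,
        "half": 0.5,
        "triple": 3.0,
        "third": 1.0 / 3.0,
    }
    for label, factor in factors.items():
        target = reference_bpm * factor
        if abs(estimate_bpm - target) <= tolerance * target:
            return label
    return "other"


print(tempo_relation(120.0, 60.0))   # estimate is twice the reference
print(tempo_relation(100.0, 120.0))  # no simple integer relation
```

An evaluation that accepts the "double", "half", "triple", and "third" labels as correct measures tempo only up to these integer relations, which is the ambiguity the statement attributes to both algorithms and human tappers.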
“…Instead of extracting tempo and beat information explicitly, various spectrogram-like representations have been proposed for visualizing tempo-related information over time. Such mid-level representations include tempograms [13,14,15], rhythmograms [16], or beat spectrograms [4,17]. Cyclic versions of time-tempo representations, which possess a high degree of robustness to pulse level switches, have been introduced in [15,17].…”
Section: Introductionmentioning
confidence: 99%
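
By analogy with chroma features, a cyclic time-tempo representation can be sketched by folding tempi that differ by a power of two onto the same class, which is what makes it insensitive to a switch between pulse levels. The following is a minimal sketch under stated assumptions: the function names, the 60 BPM reference tempo, and the 12 bins per tempo octave are hypothetical choices, and this is not the implementation from the paper or from the citing works.

```python
import math


def tempo_class(bpm, reference_bpm=60.0):
    """Map a tempo to a position in [0, 1) within one tempo octave.

    Tempi that differ by a power of two (e.g. 60, 120, 240 BPM) map to the
    same position. The reference tempo is an assumed parameter.
    """
    return math.log2(bpm / reference_bpm) % 1.0


def fold_tempogram_frame(tempi_bpm, strengths, num_bins=12, reference_bpm=60.0):
    """Accumulate one frame of tempo strengths into cyclic tempo bins."""
    folded = [0.0] * num_bins
    for bpm, strength in zip(tempi_bpm, strengths):
        index = int(tempo_class(bpm, reference_bpm) * num_bins) % num_bins
        folded[index] += strength
    return folded


# 60 and 120 BPM share a bin; 90 BPM falls into a different one.
frame = fold_tempogram_frame([60.0, 120.0, 90.0], [1.0, 0.5, 0.25])
print(frame)
```

With this folding, a performance whose dominant pulse jumps from 60 to 120 BPM keeps its energy in the same bin, whereas a non-cyclic tempogram would show the energy moving to a different tempo row.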