Temporal audio segmentation using MPEG-7 descriptors

Wellhausen, Jens; Crysandt, Holger

doi:10.1117/12.476256

Cited by 4 publications

(8 citation statements)

References 3 publications

“…They thus adopt the following basic strategy: detect similar sections that repeat within a musical piece (such as a repeating phrase) and output those that appear most often. On entering the 2000s, this strategy has led to methods for extracting a single segment from several chorus sections by detecting a repeated section of a designated length as the most representative part of a musical piece [417], [27], [103]; methods for segmenting music, discovering repeated structures, or summarizing a musical piece through bottom-up analyses without assuming the output segment length [110], [111], [512], [516], [23], [195], [104], [82], [664], [420]; and a method for exhaustively detecting all chorus sections by determining the start and end points of every chorus section [224].…”

Section: ^Nset-time Vector (mentioning)

confidence: 99%

“…The following acoustic features, which capture pitch and timbral features of audio signals in different ways, were used in various methods: chroma vectors [224], [27], [110], [111], mel-frequency cepstral coefficients (MFCC) [417], [103], [23], [195], [104], (dimension-reduced) spectral coefficients [103], [195], [104], [82], [664], pitch representations using F0 estimation or constant-Q filterbanks [110], [111], [82], [420], and dynamic features obtained by supervised learning [512], [516]. …”

Section: Extracting Acoustic Features and Calculating Their Similarity (mentioning)

confidence: 99%

Signal Processing Methods for Music Transcription

Klapuri, Davy

2006

257

“…Music segmentation or structure discovery methods [6]-[13] where the output segment length is not assumed have also been studied. Dannenberg and Hu [6], [7] developed a structure discovery method of clustering pairs of similar segments obtained by several techniques such as efficient dynamic programming or iterative greedy algorithms.…”

Section: Related Work (mentioning)

confidence: 99%

“…Chai and Vercoe [12] developed a method of detecting segment repetitions by using dynamic programming, clustering the obtained segments, and labeling the segments based on heuristic rules such as the rule of first labeling the most frequent segments, removing them, and repeating the labeling process. Wellhausen and Crysandt [13] studied the similarity matrix of spectral-envelope features defined in the MPEG-7 descriptors and a technique of detecting noncentral diagonal line segments.…”

Section: Related Work (mentioning)

confidence: 99%
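The similarity-matrix idea mentioned in the statement above can be illustrated with a minimal sketch. It is not the method of Wellhausen and Crysandt: the cosine similarity measure, the function names `self_similarity` and `best_off_diagonal_lag`, and the `min_len` parameter are all assumptions chosen for illustration, and the input can be any per-frame feature matrix. A repeated section shows up as a stripe of high similarity on a diagonal away from the main one, and the offset of that diagonal is the time lag between the repetitions.

```python
import numpy as np

def self_similarity(features):
    """Cosine similarity between all pairs of frames.

    features: array of shape (n_frames, n_dims), one feature vector per frame.
    Returns an (n_frames, n_frames) matrix with ones on the main diagonal.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T

def best_off_diagonal_lag(sim, min_lag=1, min_len=4):
    """Find the off-diagonal (lag) with the highest mean similarity.

    Diagonals shorter than min_len frames are skipped so that a single
    chance match near the corner of the matrix cannot win.
    """
    n = sim.shape[0]
    lags = np.arange(min_lag, n - min_len + 1)
    scores = np.array([np.diagonal(sim, offset=lag).mean() for lag in lags])
    return int(lags[np.argmax(scores)]), scores
```

For a toy sequence with the layout A, B, A (random feature vectors per section), the best lag equals the distance between the two A sections. Real detectors look for line segments within a diagonal rather than averaging the whole diagonal, which this sketch does only for brevity.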

A chorus section detection method for musical audio signals and its application to a music listening station

Goto

2006

IEEE Trans. Audio Speech Lang. Process.

101

Abstract: This paper describes a method for obtaining a list of repeated chorus ("hook") sections in compact-disc recordings of popular music. The detection of chorus sections is essential for the computational modeling of music understanding and is useful in various applications, such as automatic chorus-preview/search functions in music listening stations, music browsers, or music retrieval systems. Most previous methods detected as a chorus a repeated section of a given length and had difficulty identifying both ends of a chorus section and dealing with modulations (key changes). By analyzing relationships between various repeated sections, our method, called RefraiD, can detect all the chorus sections in a song and estimate both ends of each section. It can also detect modulated chorus sections by introducing a perceptually motivated acoustic feature and a similarity that enable detection of a repeated chorus section even after modulation. Experimental results with a popular music database showed that this method correctly detected the chorus sections in 80 of 100 songs. This paper also describes an application of our method, a new music-playback interface for trial listening called SmartMusicKIOSK, which enables a listener to directly jump to and listen to the chorus section while viewing a graphical overview of the entire song structure. The results of implementing this application have demonstrated its usefulness.

“…The proposed approach is based on the work in [10], and it is extended to improve the recognition accuracy by altering the underlying feature set, utilizing the k-means algorithm to cluster the detected chorus sections, and performing additional post processing techniques. These steps are explained in the following subsections.…”

Section: Structural Similarity Analysis (mentioning)

confidence: 99%

Generating Expressive Summaries for Speech and Musical Audio using Self-Similarity Clues

Sert

Baykal

Yazıcı

2006

2006 IEEE International Conference on Multimedia and Expo

We present a novel algorithm for structural analysis of audio to detect repetitive patterns that are suitable for content-based audio information retrieval systems, since repetitive patterns can provide valuable information about the content of audio, such as a chorus or a concept. The Audio Spectrum Flatness (ASF) feature of the MPEG-7 standard, although not having been considered as much as other feature types, has been utilized and evaluated as the underlying feature set. Expressive summaries are chosen as the longest patterns by the k-means clustering algorithm. The proposed approach is evaluated on a test bed consisting of popular song and speech clips based on the ASF feature. The well-known Mel Frequency Cepstral Coefficients (MFCCs) are also considered in the experiments for the evaluation of features. Experiments show that all the repetitive patterns and their locations are obtained with an accuracy of 93% and 78% for music and speech, respectively.
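As a rough illustration of the kind of feature the abstract above refers to, the sketch below computes a spectral flatness value for one audio frame as the ratio of the geometric mean to the arithmetic mean of its power spectrum. This is a simplification under stated assumptions: the MPEG-7 ASF descriptor is defined per frequency band, whereas this hypothetical `spectral_flatness` helper uses the whole spectrum, and the Hann window and `eps` floor are choices made here, not taken from the cited paper.

```python
import numpy as np

def spectral_flatness(frame, eps=1e-12):
    """Whole-spectrum flatness of one frame (simplified; not band-wise ASF).

    Values near 1 indicate a noise-like (flat) spectrum; values near 0
    indicate a tonal spectrum dominated by a few peaks.
    """
    windowed = frame * np.hanning(len(frame))       # assumed Hann window
    power = np.abs(np.fft.rfft(windowed)) ** 2 + eps  # eps avoids log(0)
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return geometric_mean / arithmetic_mean
```

A pure sine frame yields a value close to 0 and a white-noise frame a much larger one, which is why flatness can help separate tonal from noise-like passages before a similarity analysis.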
