2003
DOI: 10.1117/12.476256

Temporal audio segmentation using MPEG-7 descriptors

Abstract: In this paper we present an audio segmentation technique by searching similar sections of a song. The search is performed on MPEG-7 low-level audio feature descriptors as a growing source of multimedia meta data. These descriptors are available every 10 ms of audio data. For each block the similarity to each other block is determined. The result of this operation is a matrix which contains off-diagonal stripes representing similar regions. At that point some postprocessing is necessary due to a very disturbed …
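The block-to-block comparison described in the abstract can be illustrated with a small self-similarity matrix. This is a minimal sketch, not the paper's implementation: the abstract does not name the similarity measure, so cosine similarity is an assumption here, and the three-dimensional toy vectors stand in for real MPEG-7 descriptor frames. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat silent (all-zero) frames as dissimilar to everything
    return dot / (norm_a * norm_b)


def self_similarity_matrix(frames):
    """Compare every frame with every other frame; returns an n x n matrix."""
    n = len(frames)
    return [[cosine_similarity(frames[i], frames[j]) for j in range(n)]
            for i in range(n)]


# Toy "song" whose frames follow the pattern A B C A B C.
# Real input would be one descriptor vector per 10 ms block.
a, b, c = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
frames = [a, b, c, a, b, c]
S = self_similarity_matrix(frames)

for row in S:
    print(" ".join(f"{value:.0f}" for value in row))
```

In the printed matrix, the main diagonal is all ones because each frame matches itself, and the repeated A B C section shows up as a second stripe of ones three frames away from the diagonal. That offset stripe is the kind of structure the abstract refers to.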

Cited by 4 publications (8 citation statements)
References 3 publications
“…They thus adopt the following basic strategy: detect similar sections that repeat within a musical piece (such as a repeating phrase) and output those that appear most often. On entering the 2000s, this strategy has led to methods for extracting a single segment from several chorus sections by detecting a repeated section of a designated length as the most representative part of a musical piece [417], [27], [103]; methods for segmenting music, discovering repeated structures, or summarizing a musical piece through bottom-up analyses without assuming the output segment length [110], [111], [512], [516], [23], [195], [104], [82], [664], [420]; and a method for exhaustively detecting all chorus sections by determining the start and end points of every chorus section [224].…”
Section: ^Nset-time Vector
Citation type: mentioning
confidence: 99%
“…The following acoustic features, which capture pitch and timbral features of audio signals in different ways, were used in various methods: chroma vectors [224], [27], [110], [111], mel-frequency cepstral coefficients (MFCC) [417], [103], [23], [195], [104], (dimension-reduced) spectral coefficients [103], [195], [104], [82], [664], pitch representations using F0 estimation or constant-Q filterbanks [110], [111], [82], [420], and dynamic features obtained by supervised learning [512], [516]. …”
Section: Extracting Acoustic Features and Calculating Their Similarity
Citation type: mentioning
confidence: 99%
“…Music segmentation or structure discovery methods [6]- [13] where the output segment length is not assumed have also been studied. Dannenberg and Hu [6], [7] developed a structure discovery method of clustering pairs of similar segments obtained by several techniques such as efficient dynamic programming or iterative greedy algorithms.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Chai and Vercoe [12] developed a method of detecting segment repetitions by using dynamic programming, clustering the obtained segments, and labeling the segments based on heuristic rules such as the rule of first labeling the most frequent segments, removing them, and repeating the labeling process. Wellhausen and Crysandt [13] studied the similarity matrix of spectral-envelope features defined in the MPEG-7 descriptors and a technique of detecting noncentral diagonal line segments.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
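The idea of detecting noncentral diagonal line segments in a similarity matrix, mentioned in the statement above, can be sketched as a scan along each off-diagonal. This is an illustrative sketch only and not the method of Wellhausen and Crysandt: the function name, the threshold, and the minimum run length are hypothetical, and the toy matrix is idealized (a real matrix would be noisy and need postprocessing first).

```python
def find_diagonal_segments(S, threshold=0.9, min_length=2):
    """Return (start, lag, length) for each run of similar frames off the main diagonal.

    A run on the diagonal with offset `lag` means frames start..start+length-1
    resemble frames start+lag..start+lag+length-1.
    """
    n = len(S)
    segments = []
    for lag in range(1, n):  # lag 0 is the trivial main diagonal
        run_start = None
        # One extra iteration past the end flushes a run that reaches the border.
        for i in range(n - lag + 1):
            inside = i < n - lag and S[i][i + lag] >= threshold
            if inside and run_start is None:
                run_start = i
            elif not inside and run_start is not None:
                length = i - run_start
                if length >= min_length:
                    segments.append((run_start, lag, length))
                run_start = None
    return segments


# Idealized similarity matrix for frames following the pattern A B C A B C:
# two frames are "similar" (1.0) exactly when they hold the same symbol.
n = 6
S = [[1.0 if i % 3 == j % 3 else 0.0 for j in range(n)] for i in range(n)]

print(find_diagonal_segments(S))  # prints [(0, 3, 3)]
```

The single result says that the three frames starting at index 0 repeat three frames later, which matches the repeated A B C section in the toy input.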
“…The proposed approach is based on the work in [10], and it is extended to improve the recognition accuracy by altering the underlying feature set, utilizing the k-means algorithm to cluster the detected chorus sections, and performing additional post processing techniques. These steps are explained in the following subsections.…”
Section: Structural Similarity Analysis
Citation type: mentioning
confidence: 99%