2009
DOI: 10.1007/s11042-009-0360-2

Indexing music by mood: design and integration of an automatic content-based annotator

Abstract: In the context of content analysis for indexing and retrieval, a method for creating automatic music mood annotation is presented. The method is based on results from psychological studies and framed into a supervised learning approach using musical features automatically extracted from the raw audio signal. We present here some of the most relevant audio features to solve this problem. A ground truth, used for training, is created using both social network information systems (wisdom of crowds) and individual…
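The supervised setup the abstract outlines — predicting a mood label from features extracted per track — can be sketched minimally as a nearest-centroid classifier. The feature vectors, mood names, and cluster locations below are synthetic placeholders, not the paper's data or its actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: 2-D feature vectors (e.g. mean MFCC, tempo) per mood.
# Two well-separated synthetic clusters stand in for annotated tracks.
train_X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2)),   # "sad" cluster
    rng.normal(loc=[3.0, 3.0], scale=0.3, size=(20, 2)),   # "happy" cluster
])
train_y = np.array(["sad"] * 20 + ["happy"] * 20)

def nearest_centroid_predict(X, train_X, train_y):
    """Assign each row of X the mood whose class centroid is closest (L2)."""
    labels = np.unique(train_y)
    centroids = np.vstack([train_X[train_y == c].mean(axis=0) for c in labels])
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return labels[dists.argmin(axis=1)]

pred = nearest_centroid_predict(np.array([[2.9, 3.1], [0.1, -0.2]]),
                                train_X, train_y)
print(pred)  # → ['happy' 'sad']
```

The real annotator uses a richer feature set and learned classifier; this only illustrates the train-on-labeled-features, predict-on-new-tracks shape of the approach.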



Cited by 37 publications (25 citation statements) | References 30 publications
“…A common approach in these studies has been to extract a range of features, often low-level ones such as timbre, dynamics, articulation, Mel-frequency cepstral coefficients (MFCC) and subject them to further analysis. The parameters of the actual feature extraction are dependent on the goals of the particular study; some focus on shorter musical elements, particularly the MFCC and its derivatives [21,39,40]; while others utilize more high-level concepts, such as harmonic progression [41][42][43]. In this study, the aim was to characterize the semantic structures with a combined set of non-redundant, robust low-level acoustic and musical features suitable for this particular set of data.…”
Section: Determining the Acoustic Qualities of Each Cluster
confidence: 99%
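The low-level descriptors this citation mentions (timbre, MFCC, and related spectral features) are computed frame-by-frame from the audio spectrum. As one concrete, hedged example, the spectral centroid — the magnitude-weighted mean frequency of a frame — can be computed with plain numpy; the sine-wave input and sample rate below are illustrative, not from any of the cited studies:

```python
import numpy as np

sr = 22050                              # sample rate in Hz (assumed)
t = np.arange(2048) / sr
frame = np.sin(2 * np.pi * 440.0 * t)   # 440 Hz test tone

# Window the frame, take the magnitude spectrum, and compute the
# magnitude-weighted mean frequency (the spectral centroid).
spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
centroid = (freqs * spectrum).sum() / spectrum.sum()
print(centroid)  # close to 440 Hz for a pure tone
```

MFCCs follow the same pattern with extra steps (mel filterbank, log, DCT); in practice these descriptors are usually taken from an extraction library rather than written by hand.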
“…In contrast, research on computing high-level semantic features from low-level audio descriptors exists. In particular, in the context of MIR classification problems, genre classification [34], mood detection [35], [36], and artist identification [24] have gathered much research attention.…”
Section: A Music Similarity
confidence: 99%
“…Descriptor class:
Timbral — Bark bands [35], [37]; MFCCs [13], [35], [37], [38]; Pitch [39], pitch centroid [40]; Spectral centroid, spread, kurtosis, rolloff, decrease, skewness [35], [37], [41]; High-frequency content [39], [41]; Spectral complexity [35]; Spectral crest, flatness, flux [37], [41]; Spectral energy, energy bands, strong peak, tristimulus [41]; Inharmonicity, odd-to-even harmonic energy ratio [37]
Rhythmic — BPM, onset rate [35], [39], [41]; Beats loudness, beats loudness bass [40]
Tonal — Transposed and untransposed harmonic pitch class profiles, key strength [35], [42]; Tuning frequency [42]; Dissonance [35], [43]; Chord change rate [35]; Chords histogram, equal-tempered deviations, non-tempered/tempered energy ratio, diatonic strength [40]
Miscellaneous — Average loudness [37]; Zero-crossing rate [13], [37]
1) Euclidean distance based on principal component analysis (L2-PCA): As a starting point, we follow the ideas proposed by Cano et al. [19], and apply an unweighted Euclidean metric on a manually selected subset of the descriptors outlined above 6 . This subset includes bark bands, pitch, spectral centroid, spread, kurtosis, rolloff, decrease, skewness, high-frequency content, spectral complexity, spectral crest, flatness, flux, spectral energy, energy bands, strong peak, tristimulus, inharmonicity, odd-to-even harmonic energy ratio, beats loudness, beats loudness bass, untransposed harmonic pitch class profiles, key strength, average loudness, and zero-crossing rate.…”
Section: Descriptor Group
confidence: 99%
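The L2-PCA idea quoted above — standardize a descriptor matrix, project onto its principal components, and compare tracks with an unweighted Euclidean distance — can be sketched with numpy's SVD. The 5-track × 4-descriptor matrix below is a synthetic stand-in for the real descriptor subset, and the choice of 2 retained components is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 4))          # rows = tracks, cols = descriptors

# Z-score each descriptor so no single descriptor dominates the distance,
# then project onto the leading principal components via SVD.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
proj = Xs @ Vt[:2].T                 # keep the first 2 principal components

def l2(a, b):
    """Unweighted Euclidean distance between two tracks in PCA space."""
    return float(np.linalg.norm(a - b))

# Pairwise distances from track 0 to two candidate tracks.
print(l2(proj[0], proj[1]), l2(proj[0], proj[2]))
```

Ranking all pairwise distances then yields a similarity ordering over tracks; the cited work selects the descriptor subset manually rather than using all available features.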