Social tags inherent in online music services such as Last.fm provide a rich
source of information on musical moods. The abundance of social tags makes this
data highly beneficial for developing techniques to manage and retrieve mood
information, and enables study of the relationships between music content and
mood representations with data substantially larger than that available for
conventional emotion research. However, no systematic assessment has been done
on the accuracy of social tags and derived semantic models at capturing mood
information in music. We propose a novel technique called Affective Circumplex
Transformation (ACT) for representing the moods of music tracks in an
interpretable and robust fashion based on semantic computing of social tags and
research in emotion modeling. We validate the technique by predicting listener
ratings of moods in music tracks and compare the results to predictions
obtained with the Vector Space Model (VSM), Singular Value Decomposition (SVD),
Nonnegative Matrix Factorization (NMF), and Probabilistic Latent Semantic
Analysis (PLSA).
The results show that ACT consistently outperforms the baseline techniques, and
its performance is robust against a low number of track-level mood tags. The
results provide validation and analytical insights for harnessing the millions
of music tracks and associated mood data available through social tags in
application development.
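As a rough illustration of the kind of semantic computing involved, the sketch below builds a tag-track vector space model, reduces it with truncated SVD, and maps the result onto a two-dimensional valence-arousal plane. The tag data, reference coordinates, and least-squares alignment step are illustrative assumptions, not the exact ACT procedure or data reported in the paper.

```python
# Minimal sketch (not the paper's exact ACT pipeline): build a mood-tag
# vector space model (VSM), reduce it with SVD, and relate the reduced
# space to a valence-arousal circumplex. All values are placeholders.
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Rows = tracks, columns = mood tags; entries could be tag counts or TF-IDF.
tags = ["happy", "sad", "angry", "calm", "energetic"]
track_tag_matrix = np.array([
    [5, 0, 0, 1, 3],   # track 1
    [0, 4, 1, 2, 0],   # track 2
    [1, 0, 5, 0, 4],   # track 3
])

# SVD baseline: low-rank semantic space shared by tracks and tags.
svd = TruncatedSVD(n_components=2, random_state=0)
track_coords = svd.fit_transform(track_tag_matrix)   # tracks in 2-D
tag_coords = svd.components_.T                       # tags in the same 2-D space

# Hypothetical valence-arousal reference positions for a few mood terms,
# used to anchor the semantic space to the affective circumplex.
reference_va = {"happy": (0.9, 0.5), "sad": (-0.8, -0.4),
                "angry": (-0.6, 0.8), "calm": (0.6, -0.7)}

# Least-squares affine map from the SVD space to the valence-arousal plane,
# estimated from the reference terms (a crude stand-in for ACT's alignment).
idx = [tags.index(t) for t in reference_va]
X = np.hstack([tag_coords[idx], np.ones((len(idx), 1))])  # add bias column
Y = np.array([reference_va[t] for t in reference_va])
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

track_va = np.hstack([track_coords, np.ones((len(track_coords), 1))]) @ W
print("Estimated (valence, arousal) per track:\n", track_va)
```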
Background and objectives
Music has a unique capacity to evoke both strong emotions and vivid autobiographical memories. Previous music information retrieval (MIR) studies have shown that the emotional experience of music is influenced by a combination of musical features, including tonal, rhythmic, and loudness features. Here, our aim was to explore the relationship between music-evoked emotions and music-evoked memories, and to examine how musical features (derived with MIR) can predict both.
Methods
Healthy older adults (N = 113, age ≥ 60 years) participated in a listening task in which they rated a total of 140 song excerpts, comprising folk songs and popular songs from the 1950s to the 1980s, on five domains measuring the emotional (valence, arousal, emotional intensity) and memory (familiarity, autobiographical salience) experience of the songs. A set of 24 musical features was extracted from the songs using computational MIR methods. Principal component analyses were applied to reduce multicollinearity, resulting in six core musical components, which were then used to predict the behavioural ratings in multiple regression analyses.
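A minimal sketch of this kind of analysis pipeline, assuming the 24 MIR features and the five rating scales are available as numeric arrays (all variable names and data below are simulated placeholders, not the study's data):

```python
# Minimal sketch: PCA on 24 MIR features, then multiple regression predicting
# each behavioural rating scale from the six retained components.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_songs = 140
X_features = rng.normal(size=(n_songs, 24))   # placeholder for 24 MIR features
ratings = rng.normal(size=(n_songs, 5))       # placeholder for 5 rating scales

# Standardise, then reduce the correlated features to six components,
# mirroring the "six core musical components" step.
X_std = StandardScaler().fit_transform(X_features)
components = PCA(n_components=6).fit_transform(X_std)

# Multiple regression: predict each rating scale from the six components.
scale_names = ["valence", "arousal", "intensity", "familiarity", "autobio"]
for i, name in enumerate(scale_names):
    model = LinearRegression().fit(components, ratings[:, i])
    r2 = model.score(components, ratings[:, i])
    print(f"{name}: R^2 = {r2:.3f}")
```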
Results
All correlations between the behavioural ratings were positive and ranged from moderate to very high (r = 0.46–0.92). Emotional intensity showed the highest correlations with both autobiographical salience and familiarity. In the MIR data, three musical components measuring the salience of the musical pulse (Pulse strength), the relative strength of high harmonics (Brightness), and fluctuation in the frequencies between 200 and 800 Hz (Low-mid) predicted both music-evoked emotions and memories. Emotional intensity (and, to a lesser extent, valence) mediated the predictive effect of the musical components on music-evoked memories.
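The reported mediation can be probed with a standard regression-based (Baron-Kenny-style) check; the sketch below uses simulated placeholder variables and is not the study's actual mediation model.

```python
# Minimal mediation sketch with simulated placeholder data: does emotional
# intensity (mediator) carry the effect of a musical component (predictor)
# on autobiographical salience (outcome)?
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 140
pulse_strength = rng.normal(size=n)                    # predictor (X)
intensity = 0.6 * pulse_strength + rng.normal(size=n)  # mediator (M)
autobio = 0.5 * intensity + 0.1 * pulse_strength + rng.normal(size=n)  # outcome (Y)

X = sm.add_constant(pulse_strength)
total = sm.OLS(autobio, X).fit()                       # path c: X -> Y
a_path = sm.OLS(intensity, X).fit()                    # path a: X -> M
XM = sm.add_constant(np.column_stack([pulse_strength, intensity]))
direct = sm.OLS(autobio, XM).fit()                     # paths c' and b

print("total effect c  :", total.params[1])
print("direct effect c':", direct.params[1])
print("indirect a*b    :", a_path.params[1] * direct.params[2])
```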
Conclusions
The results suggest that music-evoked emotions are strongly related to music-evoked memories in healthy older adults and that both music-evoked emotions and memories are predicted by the same core musical features.
This study investigates whether taking genre into account is beneficial for automatic music mood annotation in terms of the core affects valence, arousal, and tension, as well as several other mood scales. Novel techniques employing genre-adaptive semantic computing and audio-based modelling are proposed. A technique called ACTwg employs genre-adaptive semantic computing of mood-related social tags, whereas ACTwg-SLPwg combines semantic computing and audio-based modelling, both in a genre-adaptive manner. The proposed techniques are experimentally evaluated at predicting listener ratings for a set of 600 popular music tracks spanning multiple genres. The results show that ACTwg outperforms a semantic computing technique that does not exploit genre information, and that ACTwg-SLPwg outperforms conventional techniques and other genre-adaptive alternatives. In particular, improvements in prediction rates are obtained for the valence dimension, which is typically the most challenging core affect dimension for audio-based annotation. The specificity of the genre categories is not crucial for the performance of ACTwg-SLPwg. The study also presents analytical insights into inferring a concise tag-based genre representation for genre-adaptive music mood analysis.
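As a rough illustration of genre-adaptive modelling (not the paper's exact ACTwg or ACTwg-SLPwg formulation), the sketch below fits one audio-based mood predictor per genre and combines the per-genre predictions using each track's soft genre membership; all names, weights, and data are illustrative assumptions.

```python
# Minimal sketch of genre-adaptive mood prediction, assuming each track has
# soft genre weights (e.g. derived from genre tags) and some audio features.
# This is an illustrative weighting scheme, not the paper's ACTwg/SLPwg model.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
genres = ["rock", "pop", "electronic"]
n_tracks, n_audio_features = 600, 10

audio = rng.normal(size=(n_tracks, n_audio_features))          # audio features
genre_weights = rng.dirichlet(np.ones(len(genres)), n_tracks)  # soft genre membership
valence = rng.normal(size=n_tracks)                            # listener ratings (placeholder)

# Fit one regression model per genre, weighting tracks by genre membership.
models = []
for g in range(len(genres)):
    m = Ridge(alpha=1.0).fit(audio, valence, sample_weight=genre_weights[:, g])
    models.append(m)

# Genre-adaptive prediction: combine per-genre outputs with the same weights.
per_genre_pred = np.column_stack([m.predict(audio) for m in models])
genre_adaptive_pred = np.sum(genre_weights * per_genre_pred, axis=1)
print("Example predictions:", genre_adaptive_pred[:5])
```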