This work advances the music emotion recognition state of the art by proposing novel emotionally relevant audio features. We reviewed the existing audio features implemented in well-known frameworks and their relationships with the eight commonly defined musical concepts. This knowledge helped uncover musical concepts lacking computational extractors, for which we propose algorithms, namely related to musical texture and expressive techniques. To evaluate our work, we created a public dataset of 900 audio clips with subjective annotations following Russell's emotion quadrants. The existing audio features (baseline) and the proposed features (novel) were tested using 20 repetitions of 10-fold cross-validation. Adding the proposed features improved the F1-score to 76.4 percent (by 9 percent) when compared to a similar number of baseline-only features. Moreover, analysing the relevance of the features and the results uncovered interesting relations, namely the weight of specific features and musical concepts in each emotion quadrant, and warrants promising new directions for future research in the fields of music emotion recognition, interactive media, and novel music interfaces.
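To make the evaluation protocol described in this abstract concrete, the sketch below runs 20 repetitions of stratified 10-fold cross-validation scored with macro F1 using scikit-learn. It is a minimal illustration, not the authors' code: the feature matrix, the quadrant labels and the SVM classifier are placeholders standing in for the real extracted features and model.

```python
# Illustrative sketch (not the authors' code): 20 x 10-fold cross-validation
# with a macro F1 score, as in the evaluation protocol described above.
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: rows are audio clips, columns are audio features,
# labels are Russell's quadrants (1-4). Replace with real extracted features.
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 100))      # e.g. 900 clips x 100 features
y = rng.integers(1, 5, size=900)     # quadrant labels Q1-Q4

cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=20, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(model, X, y, cv=cv, scoring="f1_macro")
print(f"Macro F1: {scores.mean():.3f} +/- {scores.std():.3f}")
```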
The design of meaningful audio features is a key need to advance the state of the art in Music Emotion Recognition (MER). This work presents a survey of the existing emotionally relevant computational audio features, supported by the music psychology literature on the relations between eight musical dimensions (melody, harmony, rhythm, dynamics, tone color, expressivity, texture and form) and specific emotions. Based on this review, current gaps and needs are identified and strategies for future research on feature engineering for MER are proposed, namely ideas for computational audio features capturing elements of musical form, texture and expressivity that merit further research. Finally, although the focus of this article is on classical feature engineering methodologies (based on handcrafted features), perspectives on deep learning-based approaches are discussed.
This research addresses the role of lyrics in the music emotion recognition process. Our approach is based on several state-of-the-art features complemented by novel stylistic, structural and semantic features. To evaluate our approach, we created a ground-truth dataset containing 180 song lyrics annotated according to Russell's emotion model. We conducted four types of experiments: regression and classification by quadrant, arousal and valence categories. Compared to the state-of-the-art features (n-grams, the baseline), adding other features, including the novel ones, improved the F-measure from 69.9%, 82.7% and 85.6% to 80.1%, 88.3% and 90%, respectively, for the three classification experiments. To study the relation between features and emotions (quadrants), we performed experiments to identify the best features for describing and discriminating each quadrant. To further validate these experiments, we built a validation set comprising 771 lyrics extracted from the AllMusic platform, achieving a 73.6% F-measure in the classification by quadrants. We also conducted experiments to identify interpretable rules that show the relations between features and emotions and among the features themselves. Regarding regression, the results show that, compared to similar studies for audio, we achieve a similar performance for arousal and a much better performance for valence.
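As a rough illustration of the n-gram baseline mentioned in this abstract, the snippet below classifies lyrics into Russell's quadrants with TF-IDF word uni/bigrams and a linear SVM. The lyrics, labels and pipeline choices are hypothetical placeholders; a real run would load the 180-song ground truth and evaluate with cross-validation, and it omits the stylistic, structural and semantic features the study adds on top.

```python
# Illustrative n-gram baseline (not the authors' code): TF-IDF word
# uni/bigrams feeding a linear SVM that predicts Russell's quadrants.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical placeholder lyrics and quadrant labels.
lyrics = [
    "sunshine and laughter fill the day",       # hypothetical Q1 (happy, energetic)
    "rage burning through the endless night",   # hypothetical Q2 (tense, angry)
    "tears falling slowly in the rain",         # hypothetical Q3 (sad)
    "quiet evening, calm and warm",             # hypothetical Q4 (calm, serene)
]
quadrants = ["Q1", "Q2", "Q3", "Q4"]

baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
baseline.fit(lyrics, quadrants)
print(baseline.predict(["a calm and quiet warm evening"]))
```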
We propose a novel approach to music emotion recognition that combines standard and melodic features extracted directly from audio. To this end, a new audio dataset organized similarly to the one used in the MIREX mood task comparison was created. From the data, 253 standard and 98 melodic features are extracted and used with several supervised learning techniques. Results show that, in general, melodic features perform better than standard audio features. The best result, a 64% F-measure, was obtained with only 11 features (9 melodic and 2 standard), selected with ReliefF feature selection and classified with support vector machines.
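The sketch below shows how ReliefF-based feature selection can be chained with an SVM in a cross-validated pipeline, assuming the skrebate package's ReliefF implementation. It is only an outline of the general technique named in the abstract, with random placeholder data and parameters, not the authors' actual experimental setup.

```python
# Illustrative sketch (not the authors' code): rank features with ReliefF
# and train an SVM on the top-ranked subset (assumes the skrebate package).
import numpy as np
from skrebate import ReliefF  # assumption: skrebate's ReliefF transformer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: clips x (standard + melodic) features, mood-class labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 351))      # e.g. 253 standard + 98 melodic features
y = rng.integers(0, 5, size=300)     # e.g. five MIREX-style mood clusters

model = make_pipeline(
    StandardScaler(),
    ReliefF(n_features_to_select=11, n_neighbors=10),  # keep the 11 best features
    SVC(kernel="rbf"),
)
scores = cross_val_score(model, X, y, cv=10, scoring="f1_macro")
print(f"Macro F1 with ReliefF-selected features: {scores.mean():.3f}")
```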
Remote monitoring of health parameters is a promising approach to improving the health condition and quality of life of particular groups of the population, and it can also alleviate the current expenditure and demands of healthcare systems. The elderly, usually affected by chronic comorbidities, are a specific group of the population that can strongly benefit from telehealth technologies, which allow them to lead a more independent life by living longer in their own homes. The usability of telehealth technologies and their acceptance by end users are essential requirements for the success of telehealth implementation. Older people are often resistant to new technologies or have difficulty using them due to vision, hearing, sensory and cognitive impairments. In this paper, we describe the implementation of an IoT-based telehealth solution designed specifically to address the needs of the elderly. The end user interacts with a TV set to record biometric parameters and to receive warnings and recommendations derived from health and environmental sensor recordings. The familiarity of older people with the TV is expected to provide a more user-friendly interaction, ensuring the effective integration of the end user in the overall telehealth solution.