We present Essentia 2.0, an open-source C++ library for audio analysis and audio-based music information retrieval, released under the Affero GPL license. It contains an extensive collection of reusable algorithms that implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal, and high-level music descriptors. The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which facilitates fast prototyping and the rapid setup of research experiments. Furthermore, it includes a Vamp plugin for visualization with Sonic Visualiser. The library is cross-platform and currently supports Linux, Mac OS X, and Windows. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized for computational cost. The provided functionality, specifically the out-of-the-box music descriptors and signal processing algorithms, is easily extensible and supports both research experiments and the development of large-scale industrial applications.
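As a brief illustration of the Python wrapper described above, the following sketch computes frame-wise MFCCs with Essentia's standard-mode algorithms; the audio file name is a placeholder, and the frame parameters are one common choice rather than a prescribed setting.

```python
import essentia.standard as es

# Load an audio file as a mono signal (resampled to 44100 Hz by default).
# 'example.wav' is a placeholder path.
audio = es.MonoLoader(filename='example.wav')()

# Instantiate reusable algorithm objects once, then apply them per frame.
window = es.Windowing(type='hann')
spectrum = es.Spectrum()
mfcc = es.MFCC()

mfccs = []
for frame in es.FrameGenerator(audio, frameSize=1024, hopSize=512):
    spec = spectrum(window(frame))   # magnitude spectrum of the windowed frame
    _, coeffs = mfcc(spec)           # MFCC returns (bands, coefficients)
    mfccs.append(coeffs)
```

The same chain can be assembled in C++ or, for stream-based processing, with the library's streaming mode; the standard mode shown here is the most direct route for prototyping.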
The automatic assessment of music performance has become an area of increasing interest due to the growing number of technology-enhanced music learning systems. In most of these systems, the assessment of musical performance is based on pitch and onset accuracy, but very few pay attention to other important aspects of performance, such as sound quality or timbre. This is particularly true in violin education, where the quality of timbre plays a significant role in the assessment of musical performances. However, obtaining quantifiable criteria for the assessment of timbre quality is challenging, as it relies on consensus among the subjective interpretations of experts. We present an approach to assessing the quality of timbre in violin performances using machine learning techniques. We collected audio recordings of several tone qualities and performed perceptual tests to find correlations among different timbre dimensions. We processed the audio recordings to extract acoustic features for training tone-quality models. Correlations among the extracted features were analyzed, and the discriminative power of the features for distinguishing different timbre qualities was investigated. We implemented a real-time feedback system designed for pedagogical use, in which users can train their own timbre models to assess and receive feedback on their performances.
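To make the model-training step concrete, here is a minimal, hedged sketch of fitting a tone-quality classifier on precomputed acoustic features. The file paths are placeholders, and the choice of an RBF-kernel SVM is an illustrative stand-in; the paper's actual feature set and model may differ.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# X: one row of acoustic features per recording; y: expert timbre labels.
# Both paths are placeholders for whatever feature extraction produced.
X = np.load('violin_features.npy')
y = np.load('timbre_labels.npy')

# Standardize features, then classify; evaluate with 5-fold cross-validation.
model = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
scores = cross_val_score(model, X, y, cv=5)
print('cross-validated accuracy: %.2f +/- %.2f' % (scores.mean(), scores.std()))
```

In a real-time feedback setting, the fitted model would be applied to features extracted from short analysis windows of the incoming signal, with the predicted tone-quality class driving the feedback shown to the student.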
To enhance the experience of listening to classical orchestra music, either in the concert hall or at home, we present a personalized system that integrates three visualization/interaction concepts: Score Follower (points to the current position in the score), Orchestra Layout (illustrates which instruments are currently playing and their dynamics), and Structure Visualization (visualizes structural elements such as themes or motifs). Motivated by previous literature that found evidence for connections between personality and music consumption and preference, we first conducted a user study to assess the extent to which personality traits and music visualization preferences correlate. Measuring preference via pragmatic quality and personality traits according to the Big Five Inventory (BFI) questionnaire, we found substantial interconnections between them. These translate into rules relating certain personality traits (e.g., extraversion or agreeableness) to preference rankings of the visualizations. In the proposed personality-based system, users are grouped into four clusters according to their answers to the most significant personality questions determined in the study. The order of the visualizations for a given user is adapted to the ranking preferred by other users in the same cluster. The system was evaluated in a second user study, which showed a significantly higher normalized discounted cumulative gain (NDCG) for the personalized system than for a system with a randomized order of the visualizations.
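For readers unfamiliar with the evaluation metric, the sketch below computes NDCG for a system-proposed ordering of the three visualizations against a user's graded preferences. The relevance grades and the example ordering are invented for illustration; only the metric itself is taken from the abstract.

```python
import numpy as np

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    ranks = np.arange(1, len(relevances) + 1)
    return np.sum((2.0 ** np.asarray(relevances) - 1) / np.log2(ranks + 1))

def ndcg(proposed, ideal):
    """DCG of the proposed ordering normalized by the ideal ordering."""
    return dcg(proposed) / dcg(ideal)

# Hypothetical user grades: Score Follower=3, Structure=2, Orchestra Layout=1.
# Suppose the system ranks them [Score Follower, Orchestra Layout, Structure]:
proposed = [3, 1, 2]                    # grades in the system's proposed order
ideal = sorted(proposed, reverse=True)  # best possible ordering of the grades
print('NDCG = %.3f' % ndcg(proposed, ideal))
```

An NDCG of 1.0 means the system's ordering matches the user's ideal ranking; a randomized ordering yields lower values on average, which is the comparison reported in the study.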
In this paper, we provide a first-person outlook on the technical challenges and developments involved in the recording, analysis, archiving, and cloud-based interchange of multimodal string quartet performance data as part of a collaborative research project on ensemble music making. To facilitate the sharing of our own collection of multimodal recordings and extracted descriptors and annotations, we developed a hosting platform and data archival protocol through which multimodal data (audio, video, motion capture, descriptor signals) can be stored, visualized, annotated, and selectively retrieved via a web interface and a dedicated API. With this paper we make a twofold contribution: (a) we open our collection of enriched multimodal datasets, the Quartet Dataset, to the community; and (b) we introduce and enable access to our multimodal data exchange platform, the Repovizz system, through which users can upload recorded data and navigate, play back, or edit existing datasets via a standard Internet browser.
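As a purely hypothetical illustration of the kind of selective retrieval such a web API enables, the sketch below issues a REST-style request to list the streams of a dataset. The base URL, endpoint path, identifiers, and response fields are all placeholders and do not reflect Repovizz's documented API.

```python
import requests

BASE_URL = 'https://example.org/api'  # placeholder, not the real Repovizz endpoint
dataset_id = '12345'                  # placeholder dataset identifier

# Hypothetical call: list the streams (audio, video, motion capture,
# descriptor signals) contained in one multimodal dataset.
resp = requests.get(f'{BASE_URL}/datasets/{dataset_id}/streams', timeout=10)
resp.raise_for_status()
for stream in resp.json():
    print(stream['name'], stream['type'])
```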