Reference annotation datasets containing harmony annotations are at the core of a wide range of studies in music information retrieval (MIR) and related fields. The majority of these datasets contain a single reference annotation describing the harmony of each piece. Nevertheless, studies showing differences among annotators in many other MIR tasks make the notion of a single 'ground-truth' reference annotation a tenuous one. In this paper, we introduce and analyse the Chordify Annotator Subjectivity Dataset (CASD), which contains chord labels from four expert annotators for 50 songs, in order to better understand how annotators differ in their chord-label choices. Our analysis reveals that annotators use distinct chord-label vocabularies, with low chord-label overlap across all annotators. Between annotators, we find on average only 73 percent overlap for the traditional major-minor vocabulary and 54 percent overlap for the most complex chord labels. A factor analysis reveals, for each annotator, the relative influence of triads, sevenths, inversions and other musical factors on their chord-label choices and on the reported difficulty of the songs. Our results further substantiate the existence of a harmonic 'subjectivity ceiling': an upper bound for evaluations in computational harmony research. Current state-of-the-art automatic chord estimation (ACE) systems perform beyond this subjectivity ceiling by about 10 percent, which suggests that current ACE algorithms are powerful enough to tune themselves to particular annotators' idiosyncrasies. Overall, our results show that annotator subjectivity is an important factor in harmonic transcriptions, which should inform future studies of harmony perception and computational models of harmony.
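The pairwise-overlap measure described above can be illustrated with a minimal sketch, assuming the annotations are already aligned to a shared beat grid. The major-minor reduction and all names below are illustrative and are not the CASD evaluation code.

```python
from itertools import combinations

def reduce_to_majmin(label):
    """Illustrative reduction: keep root plus major/minor quality only."""
    if label in ("N", "X"):              # no-chord / unknown symbols
        return label
    root, _, quality = label.partition(":")
    return f"{root}:min" if quality.startswith("min") else f"{root}:maj"

def pairwise_overlap(annotations, reduce_fn=reduce_to_majmin):
    """Mean fraction of beat positions on which two annotators give the
    same (reduced) chord label, averaged over all annotator pairs."""
    scores = []
    for a, b in combinations(annotations, 2):
        assert len(a) == len(b), "annotations must share one beat grid"
        same = sum(reduce_fn(x) == reduce_fn(y) for x, y in zip(a, b))
        scores.append(same / len(a))
    return sum(scores) / len(scores)

# Toy example: two annotators, four beats.
ann_1 = ["C:maj", "C:maj7", "A:min", "G:maj"]
ann_2 = ["C:maj", "C:maj", "A:min7", "G:7"]
print(pairwise_overlap([ann_1, ann_2]))  # 1.0 under the maj/min reduction
```

Under a richer vocabulary (keeping sevenths and inversions), the same two annotators would agree on only two of the four beats, which is the kind of gap the 73 percent versus 54 percent figures quantify.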
Melody harmonisation is a centuries-old problem and a core aspect of composition in Western tonal music. In this work we describe FHARM, an automated melody harmonisation system based on a functional model of harmony. Our system first generates multiple harmonically well-formed chord sequences for a given melody, and then selects the sequence with the smallest deviation from the harmony model. Unlike existing systems, FHARM guarantees that the generated chord sequences follow the basic rules of tonal harmony. We carry out two experiments to evaluate the quality of our harmonisations. In the first, a panel of harmony experts is asked to give their professional opinion and rate the generated chord sequences for selected melodies. In the second, we generate a chord sequence for a selected melody and compare the result to the original harmonisation by a harmony scholar. Our experiments confirm that FHARM generates realistic chords for each melody note. However, we also conclude that harmonising a melody with individually well-formed chord sequences from a harmony model does not guarantee a well-sounding coherence between the chords and the melody. We reflect on the experience gained from our experiments and propose future improvements to refine the quality of the harmonisation.
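The generate-and-select step can be sketched as follows. This is a minimal Python illustration with a placeholder transition-table cost; FHARM itself scores candidates against a functional model of tonal harmony, and the function and table names here are hypothetical.

```python
def model_deviation(sequence, harmony_model):
    """Placeholder cost: sum of penalties for chord-to-chord transitions
    not sanctioned by the model (here just a lookup table)."""
    return sum(harmony_model.get(pair, 1.0) for pair in zip(sequence, sequence[1:]))

def harmonise(melody, generate_candidates, harmony_model):
    """Generate-and-select: produce well-formed candidate chord sequences
    for the melody, then keep the one deviating least from the model."""
    candidates = generate_candidates(melody)
    return min(candidates, key=lambda seq: model_deviation(seq, harmony_model))

# Toy usage with a hand-made candidate generator and transition table.
table = {("C:maj", "F:maj"): 0.0, ("F:maj", "G:7"): 0.0, ("G:7", "C:maj"): 0.0}
gen = lambda melody: [["C:maj", "F:maj", "G:7", "C:maj"],
                      ["C:maj", "A:min", "D:min", "C:maj"]]
print(harmonise(["C", "F", "G", "C"], gen, table))  # first candidate wins
```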
The increasing accuracy of automatic chord estimation systems, the availability of vast amounts of heterogeneous reference annotations, and insights from annotator subjectivity research make chord label personalization increasingly important. Nevertheless, automatic chord estimation systems have historically been trained and evaluated exclusively on a single reference annotation. We introduce a first approach to automatic chord label personalization by modeling subjectivity through deep learning of a harmonic interval-based chord label representation. After integrating these representations from multiple annotators, we can accurately personalize chord labels for individual annotators from a single model and the annotators' chord label vocabularies. Furthermore, we show that chord label personalization using multiple reference annotations outperforms personalization using a single reference annotation.
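The two ingredients named above, an interval-based chord representation and personalization through an annotator's own vocabulary, can be sketched as follows. This is a minimal illustration, not the paper's deep model: the quality table is a small subset, and the nearest-label mapping stands in for the learned decoder.

```python
import numpy as np

# Semitone intervals above the root for a few chord qualities
# (an illustrative subset, not the paper's full representation).
QUALITY_INTERVALS = {
    "maj": {0, 4, 7}, "min": {0, 3, 7},
    "maj7": {0, 4, 7, 11}, "min7": {0, 3, 7, 10}, "7": {0, 4, 7, 10},
}
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def encode(label):
    """Encode 'root:quality' as a 12-dimensional pitch-class vector
    spanned by the chord's intervals above its root."""
    root, _, quality = label.partition(":")
    vec = np.zeros(12)
    for interval in QUALITY_INTERVALS[quality or "maj"]:
        vec[(ROOTS.index(root) + interval) % 12] = 1.0
    return vec

def personalise(predicted, vocabulary):
    """Map a predicted interval vector to the closest label drawn from
    one annotator's personal chord vocabulary."""
    return min(vocabulary, key=lambda lab: np.linalg.norm(predicted - encode(lab)))

# An annotator who never uses sevenths receives a plain triad label.
print(personalise(encode("C:maj7"), ["C:maj", "A:min", "G:7"]))  # 'C:maj'
```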
We present a study on automatic birdsong recognition with deep neural networks using the BirdCLEF 2014 dataset. Through deep learning, feature hierarchies are learned that represent the data at several levels of abstraction. Deep learning has been applied with success to problems in fields such as music information retrieval and image recognition, but its use in bioacoustics is rare. We therefore investigate the application of a common deep learning technique (deep neural networks) to a classification task using songs from Amazonian birds. We show that various deep neural networks are capable of outperforming other classification methods. Furthermore, we present an automatic segmentation algorithm that separates bird sounds from non-bird sounds.
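Segmentation of bird sound from background is often approached by thresholding frame-level energy. The sketch below is a common energy-based baseline of that kind, not the segmentation algorithm from the paper; all parameter values are illustrative.

```python
import numpy as np

def energy_segments(signal, sr, frame_len=1024, hop=512, factor=3.0):
    """Mark frames whose short-time energy exceeds a multiple of the
    median frame energy and merge them into (start, end) times in
    seconds. A common baseline, not the paper's algorithm."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    energy = np.array([np.sum(signal[i * hop:i * hop + frame_len] ** 2)
                       for i in range(n_frames)])
    active = energy > factor * np.median(energy)
    segments, start = [], None
    for i, on in enumerate(active):
        if on and start is None:
            start = i * hop / sr
        elif not on and start is not None:
            segments.append((start, i * hop / sr))
            start = None
    if start is not None:
        segments.append((start, n_frames * hop / sr))
    return segments

# Toy signal: one second of faint noise with a loud 'call' in the middle.
sr = 22050
sig = 0.01 * np.random.randn(sr)
sig[5000:9000] += np.sin(2 * np.pi * 3000 * np.arange(4000) / sr)
print(energy_segments(sig, sr))  # roughly one segment around 0.19-0.42 s
```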
Machine learning is challenging the way we make music. Although research in deep generative models has dramatically improved the capability and fluency of music models, recent work has shown that it can be challenging for humans to partner with this new class of algorithms. In this paper, we present findings on what 13 musician/developer teams, a total of 61 users, needed when co-creating a song with AI, the challenges they faced, and how they leveraged and repurposed existing characteristics of AI to overcome some of these challenges. Many teams adopted modular approaches, such as independently running multiple smaller models that align with the musical building blocks of a song before re-combining their results. Because ML models are not easily steerable, teams also generated massive numbers of samples and curated them post hoc, used a range of strategies to direct the generation, or ranked the samples algorithmically. Ultimately, teams not only had to manage the "flare and focus" aspects of the creative process, but also had to juggle them with a parallel process of exploring and curating multiple ML models and outputs. These findings reflect a need to design machine-learning-powered music interfaces that are more decomposable, steerable, interpretable, and adaptive, which in turn will enable artists to more effectively explore how AI can extend their personal expression.