The musical analysis of large-scale structures, such as the classical sonata form, requires integrating multiple analyses of local musical events into a globally coherent analysis. Modelling large-scale structures remains a challenging task for the research community. It involves building large, accurately annotated corpora and developing practical, efficient tools to visualize the analyses of these corpora. Finally, it requires designing effective and properly evaluated music information retrieval (MIR) algorithms. We propose a machine learning approach to sonata form structure, applied to 32 movements from Mozart's string quartets. We release an open dataset encoding two reference analyses of these 32 movements, totaling more than 1800 curated annotations, together with flexible visualizations of these analyses. We discuss the occurrence in this corpus of melodic, harmonic, and rhythmic features induced by pitches, durations, and rests, and we investigate whether the presence or absence of these features is characteristic of the different sections of a sonata form. We then compute the emission and transition probabilities of several Hidden Markov Models intended to match the structure of sonata forms at several resolutions. Our results confirm that the sonata form is better identified when the parameters are learned rather than set manually. These results open perspectives on the computational analysis of musical forms that combines human knowledge with machine learning from annotated scores.
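To make the idea of learned emission and transition probabilities concrete, here is a minimal sketch of how such parameters could be estimated by counting over annotated sequences. It is not the procedure used in this work: the function name `estimate_hmm`, the add-one smoothing, the section labels, and the feature symbols (`long_rest`, `no_rest`) are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def estimate_hmm(sequences, smoothing=1.0):
    """Estimate HMM transition and emission tables by smoothed counting.

    sequences: list of sequences, each a list of (state, observation) pairs,
    e.g. one pair per measure of an annotated score (hypothetical encoding).
    smoothing: additive (Laplace) constant; keep it > 0 to avoid empty rows.
    """
    states = sorted({s for seq in sequences for s, _ in seq})
    symbols = sorted({o for seq in sequences for _, o in seq})

    trans_counts = defaultdict(Counter)
    emit_counts = defaultdict(Counter)
    for seq in sequences:
        # Count which observation each annotated state emits.
        for state, obs in seq:
            emit_counts[state][obs] += 1
        # Count transitions between consecutive annotated states.
        for (prev, _), (nxt, _) in zip(seq, seq[1:]):
            trans_counts[prev][nxt] += 1

    def normalize(counts, keys):
        # Turn smoothed counts into one probability row per state.
        table = {}
        for s in states:
            total = sum(counts[s].values()) + smoothing * len(keys)
            table[s] = {k: (counts[s][k] + smoothing) / total for k in keys}
        return table

    return normalize(trans_counts, states), normalize(emit_counts, symbols)


# Toy annotated sequence (invented labels, not taken from the dataset).
toy = [[
    ("exposition", "long_rest"),
    ("exposition", "no_rest"),
    ("development", "no_rest"),
    ("recapitulation", "long_rest"),
]]
transitions, emissions = estimate_hmm(toy)
```

Each row of `transitions` and `emissions` is a probability distribution over next states or observed symbols, which is the form a Hidden Markov Model decoder expects. Manually set parameters would instead fill these same tables by hand.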