This is the accepted version of the paper.This version of the publication may differ from the final published version. In musicology and music research generally, the increasing availability of digital music, storage capacities and computing power both enable and require new and intelligent systems. In the transition from traditional to digital musicology, many techniques and tools have been developed for the analysis of individual pieces of music, but large scale music data that are increasingly becoming available require research methods and systems that work on the collection-level and at scale. Although many relevant algorithms have been developed during the last 15 years of research in Music Information Retrieval, an integrated system that supports large-scale digital musicology research has so far been lacking. In the Digital Music Lab (DML) project, a collaboration between music librarians, musicologists, computer scientists, and human-computer interface specialists, the DML software system has been developed that fills this gap by providing intelligent large-scale music analysis with a user-friendly interactive interface supporting musicologists in their exploration and enquiry. The DML system empowers musicologists by addressing several challenges: distributed processing of audio and other music data, management of the data analysis process and results, remote analysis of data under copyright, logical inference on the extracted information and metadata, and visual web-based interfaces for exploring and querying the music collections. The DML system is scalable and based on Semantic Web technology and integrates into Linked Data with the vision of a distributed system that enabling music research across archives, libraries and other providers of music data. A first DML system prototype has been set up in collaboration with the British Library and I Like Music Ltd. This system has been used to analyse a diverse corpus of currently 250,000 music tracks. In this article we describe the DML system requirements, design, architecture, components, available data sources, explaining their interaction. We report use cases and applications with initial evaluations of the proposed system.
Permanent repository link