Choral singing in the soprano, alto, tenor and bass (SATB) format is a widely practiced and studied art form with significant cultural importance. Despite its popularity, the choral setting has received little attention in the field of Music Information Retrieval. However, the recent publication of high-quality choral singing datasets, together with recent developments in deep-learning-based methodologies for music and speech processing, has opened new avenues for research in this field. In this paper, we use some of the publicly available choral singing datasets to train and evaluate state-of-the-art source separation algorithms from the speech and music domains for the case of choral singing. Furthermore, we evaluate existing monophonic F0 estimators on the separated unison stems and propose an approximation of the perceived F0 of a unison signal. Additionally, we present a set of applications combining the proposed methodologies, including synthesizing a single-singer voice from the unison, and transposing and remixing the separated stems into a synthetic multi-singer choral signal. Finally, we conduct a set of listening tests to perceptually evaluate the results obtained with the proposed methodologies.