In this paper, we propose a generative model for music extension. The model is composed of a generative adversarial network (GAN) and two classifiers, one for music emotion and one for music tonality. As a result, it can generate symbolic music based not only on the low-level spectral and temporal characteristics of previously observed music pieces, but also on their high-level emotion and tonality attributes. The generative model operates in a universal latent space, constructed by a variational autoencoder (VAE), for representing music pieces. We conduct subjective listening tests and derive objective measures for performance evaluation. Experimental results show that the proposed model produces much smoother and more authentic music pieces than the baseline model on all subjective and objective measures.