Significant progress has been made in symbolic music generation with the help of deep learning techniques. However, the tasks covered by symbolic music generation have not been well summarized, and the evolution of generative models for the specific music generation task has not been illustrated systematically. This paper attempts to provide a task-oriented survey of symbolic music generation based on deep learning techniques, covering most of the currently popular music generation tasks. The distinct models under the same task are set forth briefly and strung according to their motivations, basically in chronological order. Moreover, we summarize the common datasets suitable for various tasks, discuss the music representations and the evaluation methods, highlight current challenges in symbolic music generation, and finally point out potential future research directions.
Regional style in Chinese folk songs is a rich treasure that can be used for ethnic music creation and folk culture research. In this paper, we propose MG-VAE, a music generative model based on VAE (Variational Auto-Encoder) that is capable of capturing specific music style and generating novel tunes for Chinese folk songs (Min Ge) in a manipulatable way. Specifically, we disentangle the latent space of VAE into four parts in an adversarial training way to control the information of pitch and rhythm sequence, as well as of music style and content. In detail, two classifiers are used to separate style and content latent space, and temporal supervision is utilized to disentangle the pitch and rhythm sequence. The experimental results show that the disentanglement is successful and our model is able to create novel folk songs with controllable regional styles. To our best knowledge, this is the first study on applying deep generative model and adversarial training for Chinese music generation. 2 Related Work RNN (Recurrent Neural Network) is one of the most earliest models introduced into the domain of deep music generation. Researchers employ RNNs to model the music structure and generate different formats of music, including monophonic folk melodies [30], rhythm composition [23], expressive music performance [27], multi-part music harmonization [31]. Other recent studies have
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.