Tempo and genre are two inter-leaved aspects of music, genres are often associated to rhythm patterns which are played in specific tempo ranges.In this paper, we focus on the Deep Rhythm system based on a harmonic representation of rhythm used as an input to a convolutional neural network.To consider the relationships between frequency bands, we process complex-valued inputs through complex-convolutions.We also study the joint estimation of tempo/genre using a multitask learning approach. Finally, we study the addition of a second input convolutional branch to the system applied to a mel-spectrogram input dedicated to the timbre.This multi-input approach allows to improve the performances for tempo and genre estimation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.