The problem of note onset detection in musical signals is considered. The proposed solution builds on known approaches in which an onset detection function is defined from spectral characteristics of the audio data. In our approach, several onset detection functions are used simultaneously to form an input vector for a multi-layer non-linear perceptron, which learns to detect onsets from the training data. This contrasts with standard methods, which threshold the onset detection functions with a moving average or a moving median. Our approach also differs from most current machine-learning-based solutions in that we explicitly use the onset detection functions as an intermediate representation, which can therefore be easily replaced with a different one, e.g., to match the characteristics of a particular audio data source. The results obtained for a database containing annotated onsets for 17 different instruments and ensembles are compared with state-of-the-art solutions.
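The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the onset detection functions, so spectral flux and high-frequency content are assumed here as two common choices, and the frame length, hop size, context width, and threshold offset are all hypothetical values. The sketch builds per-frame input vectors of the kind a perceptron could be trained on, and also shows the moving-median thresholding baseline that the abstract contrasts with.

```python
import numpy as np

def stft_magnitude(signal, frame_len=1024, hop=512):
    """Magnitude spectrogram from Hann-windowed frames (parameters are assumptions)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def spectral_flux(mag):
    """Sum of positive magnitude increases between consecutive frames."""
    rise = np.maximum(np.diff(mag, axis=0), 0.0)
    return np.concatenate([[0.0], rise.sum(axis=1)])

def high_frequency_content(mag):
    """Energy weighted linearly by frequency-bin index."""
    bins = np.arange(mag.shape[1])
    return (bins * mag ** 2).sum(axis=1)

def normalise(odf):
    """Scale an onset detection function to a maximum of 1 (no-op if all zero)."""
    peak = odf.max()
    return odf / peak if peak > 0 else odf

def odf_feature_vectors(signal, context=2):
    """Stack several ODFs over a +/- context window: one input vector per frame."""
    mag = stft_magnitude(signal)
    odfs = np.stack([normalise(spectral_flux(mag)),
                     normalise(high_frequency_content(mag))], axis=1)
    padded = np.pad(odfs, ((context, context), (0, 0)), mode="edge")
    return np.stack([padded[i:i + 2 * context + 1].ravel()
                     for i in range(len(odfs))])

def median_threshold_peaks(odf, half_width=5, delta=0.1):
    """Baseline: local maxima of the ODF that exceed a moving median plus delta."""
    onsets = []
    for n in range(1, len(odf) - 1):
        lo, hi = max(0, n - half_width), min(len(odf), n + half_width + 1)
        threshold = delta + np.median(odf[lo:hi])
        if odf[n] > threshold and odf[n] >= odf[n - 1] and odf[n] > odf[n + 1]:
            onsets.append(n)
    return onsets
```

In a setup like the one the abstract describes, the rows returned by `odf_feature_vectors` would be paired with frame-level onset annotations and passed to a trained classifier in place of `median_threshold_peaks`. Swapping in a different onset detection function only changes the columns stacked inside `odf_feature_vectors`.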
Abstract: In this paper, a convolutional neural network is applied to the problem of note onset detection in audio recordings. Two time-frequency representations are analysed as input to the convolutional network, showing the superiority of the standard spectrogram over enhanced autocorrelation (EAC). Experimental evaluation is based on a dataset containing 10,939 annotated onsets, with a total duration of the audio recordings of over 45 min.