Onset detection still has room for improvement, especially when dealing with polyphonic music signals. For purposes in which the correctness of the result is a must, user intervention is therefore required to correct the mistakes made by the detection algorithm. In such an interactive paradigm, the exactness of the detection can be guaranteed at the expense of the user's work, and the effort required to accomplish the task is the value that has to be both quantified and reduced. The present work studies the idea of interactive onset detection and proposes a methodology for assessing the user's workload, as well as a set of interactive schemes for reducing that workload when carrying out this detection task. Results show that the proposed evaluation strategy is able to quantitatively assess the invested user effort. Also, the presented interactive schemes significantly facilitate the correction task compared with manual annotation.
Keywords: User interaction, Onset detection, Information retrieval, Audio analysis
Music signals may be decomposed into sound objects by means of signal processing techniques. Note events constitute an example of musical signal segmentation, and can be defined by both the moment the note starts (the onset) and its end (the offset) [...] [8,9]. To check the performance of current state-of-the-art onset detection methods, the reader is referred to the results obtained in the annual Music Information Retrieval Evaluation eXchange (MIREX) contest.
The results obtained by the current state of the art may be considered sufficiently accurate for applications such as audio structure analysis or digital audio effects, in which onset information simply supports the task rather than constituting its main description. Nevertheless, for specific cases such as note tracking in automatic music transcription, the precision of the onset events remarkably influences the overall success of the task.
Note that, while onset estimation is generally used as an intermediate process within more complex MIR systems, this task may also be considered a goal in itself. As an example, the work in [10] contemplates the use of onset information for identifying music pieces by comparing timing deviations between the onsets estimated from interpretations of the pieces and the reference annotations from their scores.
The aforementioned cases constitute particular examples in which very precise onset times are required. Generally, research in such cases implies the manual annotation of corpora, since no single onset estimation algorithm guarantees a perfect retrieval. Whereas this performance limitation is inherent to any research topic, some authors in the MIR community suggest that a glass ceiling is being reached, at least in the case of some commonly addressed tasks [11,12].
It thus appears interesting to explore alternative research paradigms that are capable of dealing with these limitations.
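To make the notion of an imperfect retrieval concrete, the following minimal sketch compares a list of estimated onsets against reference annotations. It is an illustration under stated assumptions rather than the methodology proposed in this work: the function names `match_onsets` and `evaluate` are hypothetical, the 50 ms tolerance window is an assumed value (a common choice in onset evaluation) that does not come from the text above, and the `corrections` count (spurious onsets to delete plus missed onsets to add) is only a naive proxy for the effort a user would invest.

```python
from typing import Dict, List, Tuple


def match_onsets(reference: List[float], estimated: List[float],
                 tolerance: float = 0.05) -> Tuple[int, int, int]:
    """Greedy one-to-one matching of onset times given in seconds.

    An estimate counts as correct when it lies within `tolerance`
    seconds of a reference onset that has not been matched yet.
    Returns (true_positives, false_positives, false_negatives).
    """
    ref = sorted(reference)
    est = sorted(estimated)
    i = j = tp = 0
    while i < len(ref) and j < len(est):
        diff = est[j] - ref[i]
        if abs(diff) <= tolerance:
            tp += 1      # estimate and reference agree
            i += 1
            j += 1
        elif diff < 0:
            j += 1       # estimate is too early for every remaining reference
        else:
            i += 1       # reference onset has no estimate close enough
    return tp, len(est) - tp, len(ref) - tp


def evaluate(reference: List[float], estimated: List[float],
             tolerance: float = 0.05) -> Dict[str, float]:
    """Precision, recall, F-measure and a naive correction count."""
    tp, fp, fn = match_onsets(reference, estimated, tolerance)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        # Assumed effort proxy: one action per spurious or missed onset.
        "corrections": fp + fn,
    }


if __name__ == "__main__":
    reference = [0.50, 1.00, 1.50, 2.00]   # hypothetical annotations
    estimated = [0.52, 1.20, 1.49, 2.30]   # hypothetical detector output
    print(evaluate(reference, estimated))
```

With the hypothetical values above, two estimates fall inside the tolerance window, while two are spurious and two reference onsets are missed, so a user would need four corrections under this simple count.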
Valero-Mas and Iñesta EURASIP Journal on Audio, Speech, and Music Processing (2017) 2017:15
The so-called...