This paper presents a gesture recognition/adaptation system for Human-Computer Interaction applications that goes beyond activity classification and that, complementary to gesture labeling, characterizes the movement execution. We describe a template-based recognition method that simultaneously aligns the input gesture to the templates using a Sequential Monte Carlo inference technique. In contrast to standard template-based methods built on dynamic programming, such as Dynamic Time Warping, the algorithm has an adaptation process that tracks gesture variation in real time. During execution of the gesture, the method continuously updates the estimated parameters and recognition results, which offers key advantages for continuous human-machine interaction. The technique is evaluated in several ways: recognition and early recognition are evaluated on 2D on-screen pen gestures; adaptation is assessed on synthetic data; and both early recognition and adaptation are evaluated in a user study involving 3D free-space gestures. The method is not only robust to noise and successfully adapts to parameter variation, but also performs recognition as well as or better than non-adapting offline template-based methods.
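As a rough illustration of this kind of approach, the sketch below shows how a Sequential Monte Carlo (particle filter) tracker for template-based gesture following can be organized. The particle state chosen here (template index, alignment phase, relative speed, amplitude scale) and the Gaussian noise and likelihood models are assumptions made for the example, not the paper's exact formulation.

```python
import numpy as np

# Minimal Sequential Monte Carlo sketch for template-based gesture tracking
# with online adaptation. Templates and observations are assumed to be
# sequences of d-dimensional feature vectors; the state and noise models
# below are illustrative, not the paper's exact formulation.

def track_gesture(observation_stream, templates, n_particles=500,
                  speed_sigma=0.05, scale_sigma=0.05, obs_sigma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n_templates = len(templates)
    lengths = np.array([len(t) for t in templates], dtype=float)

    tmpl  = rng.integers(0, n_templates, n_particles)   # which template
    phase = np.zeros(n_particles)                        # position in [0, 1]
    speed = np.ones(n_particles)                         # samples advanced per frame
    scale = np.ones(n_particles)                         # amplitude scaling
    w     = np.full(n_particles, 1.0 / n_particles)      # particle weights

    for obs in observation_stream:                       # one incoming frame at a time
        # Propagation: advance the phase, diffuse speed and scale (adaptation).
        speed = speed + rng.normal(0.0, speed_sigma, n_particles)
        scale = scale + rng.normal(0.0, scale_sigma, n_particles)
        phase = np.clip(phase + speed / lengths[tmpl], 0.0, 1.0)

        # Likelihood: compare the observation with each particle's template
        # sample at its current phase, rescaled by its amplitude parameter.
        pred = np.stack([templates[k][int(p * (len(templates[k]) - 1))] * s
                         for k, p, s in zip(tmpl, phase, scale)])
        lik = np.exp(-0.5 * np.sum((pred - obs) ** 2, axis=1) / obs_sigma ** 2) + 1e-300
        w = w * lik
        w = w / w.sum()

        # Resample when the effective sample size collapses.
        if 1.0 / np.sum(w ** 2) < n_particles / 2:
            idx = rng.choice(n_particles, n_particles, p=w)
            tmpl, phase, speed, scale = tmpl[idx], phase[idx], speed[idx], scale[idx]
            w = np.full(n_particles, 1.0 / n_particles)

        # Running estimates: gesture label plus alignment/variation parameters,
        # available at every frame rather than only at the end of the gesture.
        probs = np.array([w[tmpl == k].sum() for k in range(n_templates)])
        yield int(probs.argmax()), float(phase @ w), float(speed @ w), float(scale @ w)
```

Because the estimates are emitted at every frame, the same loop naturally supports early recognition and continuous adaptation of the speed and scale parameters.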
We present a methodology for the real-time alignment of music signals using sequential Monte Carlo inference techniques. The alignment problem is formulated as the state tracking of a dynamical system, and differs from traditional Hidden Markov Model and Dynamic Time Warping based systems in that the hidden state is continuous rather than discrete. The major contribution of this paper is addressing both audio-to-score and audio-to-audio alignment within the same framework in a real-time setting. The performance of the proposed methodology on both problems is then evaluated and discussed.
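To make the continuous-state formulation concrete, a generic dynamical system of this kind can be written over a reference position $s_t$ and a tempo $v_t$; the transition and observation models below are illustrative assumptions, not necessarily those used in the paper.

```latex
% Illustrative continuous state-space model for real-time alignment.
% Hidden state: reference position s_t and tempo (playback rate) v_t.
\begin{aligned}
  s_t &= s_{t-1} + v_{t-1}\,\Delta t + \epsilon_t, & \epsilon_t &\sim \mathcal{N}(0, \sigma_s^2) \\
  v_t &= v_{t-1} + \eta_t,                         & \eta_t     &\sim \mathcal{N}(0, \sigma_v^2) \\
  y_t &\sim p(y_t \mid s_t)                        & & \text{(audio-frame likelihood at position } s_t\text{)}
\end{aligned}
```

Sequential Monte Carlo then approximates the posterior $p(s_t, v_t \mid y_{1:t})$ with weighted particles updated frame by frame; for audio-to-audio alignment, $s_t$ indexes a reference recording rather than a symbolic score.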
The behavior of users of music streaming services is investigated from the point of view of the temporal dimension of individual songs. Specifically, the main object of the analysis is the point in time within a song at which users stop listening and start streaming another song (a "skip"). The main contribution of this study is establishing a correlation between the temporal distribution of skipping events and the musical structure of songs. It is also shown that this distribution is not only specific to individual songs, but also independent of the user cohort and the date of observation. Finally, user behavioral data is used to train a predictor of the musical structure of a song solely from its acoustic content; it is shown that the use of such data, available in large quantities to music streaming services, yields significant improvements in accuracy over the customary way of training this class of algorithms, in which only smaller amounts of hand-labeled data are available.
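A minimal sketch of the kind of aggregation involved is shown below, assuming a hypothetical log layout of (song_id, skip_time_s, duration_s) tuples; peaks of the resulting profile are what the study relates to structural boundaries, and such profiles can serve as weak labels for structure-segmentation models.

```python
import numpy as np

# Illustrative sketch (hypothetical data layout): build the distribution of
# skip times within a song from raw listening logs.
# `events` is assumed to be an iterable of (song_id, skip_time_s, duration_s).

def skip_profile(events, song_id, n_bins=100):
    # Normalised position within the song at which each skip occurred.
    positions = [t / d for sid, t, d in events if sid == song_id and 0 <= t <= d]
    hist, edges = np.histogram(positions, bins=n_bins, range=(0.0, 1.0), density=True)
    return hist, edges  # skip-rate profile over the song's normalised timeline
```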
A comprehensive methodology for automatic music identification is presented. The main application of the proposed approach is to provide tools to enrich and validate the descriptors of recordings digitized by a sound archive institution. Experimentation has been carried out on three different datasets, including a collection of digitized vinyl discs, although the methodology is not tied to a particular recording carrier. Automatic identification allows a music digital library to retrieve metadata about music works even if that information was incomplete or missing at the time of acquisition. Automatic segmentation of the digitized material is obtained as a byproduct of identification, allowing the music digital library to grant access to individual tracks even when an entire disc side is digitized as a single file. Results show that the approach is both efficient and effective.
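The sketch below illustrates, under assumptions of our own (a hypothetical `identify()` matcher and fixed-size analysis windows), how segmentation can emerge as a byproduct of identification: windows of the long digitized file that are identified as the same work are merged into track segments.

```python
# Illustrative sketch: slide a window over a long digitized file, identify each
# window against the reference collection, and merge consecutive windows that
# match the same work. `identify(excerpt)` is a hypothetical stand-in for the
# matcher; it is assumed to return (work_id, score), or (None, 0.0) on no match.

def segment_by_identification(audio, sr, identify, win_s=10.0, hop_s=5.0, min_score=0.5):
    win, hop = int(win_s * sr), int(hop_s * sr)
    segments = []  # list of [work_id, start_sample, end_sample]
    for start in range(0, max(1, len(audio) - win + 1), hop):
        work_id, score = identify(audio[start:start + win])
        if work_id is None or score < min_score:
            continue
        if segments and segments[-1][0] == work_id and segments[-1][2] >= start:
            segments[-1][2] = start + win                     # extend current segment
        else:
            segments.append([work_id, start, start + win])    # open a new segment
    return [(w, s / sr, e / sr) for w, s, e in segments]      # boundaries in seconds
```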
This paper describes the implementation of a content-based cover song identification system that has been released under an open source license. The system is built around the Apache Lucene text search engine library, and shows how classic techniques from textual Information Retrieval, in particular the bag-of-words paradigm, can be successfully adapted to music identification. The paper focuses on extensive experimentation with the most influential system parameters, in order to find an optimal tradeoff between retrieval accuracy and query speed.
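The sketch below illustrates the underlying bag-of-words idea with a toy in-memory inverted index and TF-IDF scoring; in the released system, indexing and ranking are delegated to Apache Lucene, and the frame-to-word quantization shown here (dominant chroma bin per frame) is a hypothetical choice for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy bag-of-words pipeline: quantise per-frame audio descriptors into discrete
# "words", then index and rank songs with TF-IDF, as a text engine would.

def to_words(chroma_frames):
    # One "word" per frame: the index of the strongest chroma bin (illustrative).
    return [f"c{max(range(len(f)), key=f.__getitem__)}" for f in chroma_frames]

class BagOfWordsIndex:
    def __init__(self):
        self.postings = defaultdict(dict)   # word -> {song_id: term frequency}
        self.n_docs = 0

    def add(self, song_id, words):
        for w, tf in Counter(words).items():
            self.postings[w][song_id] = tf
        self.n_docs += 1

    def query(self, words, top_k=10):
        scores = defaultdict(float)
        for w, qtf in Counter(words).items():
            docs = self.postings.get(w, {})
            if not docs:
                continue
            idf = math.log(1 + self.n_docs / len(docs))
            for song_id, tf in docs.items():
                scores[song_id] += qtf * tf * idf * idf
        return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]
```

Parameters such as the vocabulary size of the quantizer and the length of indexed word sequences are the kind of knobs that govern the accuracy/speed tradeoff studied in the paper.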