Music offers a uniquely abstract means of expressing human emotions and moods, in which melodic harmony arises from a succinct blend of pitch, rhythm, tempo, texture, and other sonic qualities. The emerging field of “Robotic Musicianship” focuses on developing machine intelligence, in the form of algorithms and cognitive models, that captures the underlying principles of musical perception, composition, and performance. The capability of new-generation robots to perform music in a human-like, artistically expressive manner lies at the intersection of engineering, computing, music, and psychology, and promises new forms of creating, sharing, and interpreting musical impulses. This manuscript explores how real-time collaboration between humans and machines might be achieved by integrating technological and mathematical models from Synchronization and Learning, precisely configured for the seamless joint generation of melody, towards the vision of a human–robot symphonic orchestra. To explicitly capture the key ingredients of a good symphony, namely synchronization and anticipation, this work discusses a possible approach based on a joint strategy of: (i) Mapping, wherein mathematical models of oscillator coupling, such as the Kuramoto model, could be used to establish and maintain synchronization, and (ii) Modelling, wherein modern deep learning predictive models, such as neural network architectures, are employed to anticipate (or predict) future state changes in the sequence of music generation and to pre-empt transitions in the coupled oscillator sequence. It is hoped that this discussion will foster new insights and research towards better “real-time synchronized human-computer collaborative interfaces and interactions”.
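As a rough illustration of the Mapping idea, the sketch below simulates the standard Kuramoto model, dθ_i/dt = ω_i + (K/N) Σ_j sin(θ_j − θ_i), for a small ensemble. It is not taken from the manuscript: the three "players", their natural tempi, the coupling strength, the Euler step size, and all function names are assumptions chosen only to show how coupling can pull slightly mismatched oscillators into phase.

```python
import cmath
import math


def kuramoto_step(phases, freqs, coupling, dt):
    """Advance every oscillator by one explicit Euler step of the Kuramoto model."""
    n = len(phases)
    updated = []
    for i in range(n):
        # Mean pull of all oscillators on oscillator i.
        pull = sum(math.sin(phases[j] - phases[i]) for j in range(n)) / n
        updated.append(phases[i] + dt * (freqs[i] + coupling * pull))
    return updated


def order_parameter(phases):
    """Return r in [0, 1]; r close to 1 means the phases are nearly aligned."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def simulate(phases, freqs, coupling, dt=0.01, steps=5000):
    """Integrate the model for steps * dt seconds and return the final phases."""
    for _ in range(steps):
        phases = kuramoto_step(phases, freqs, coupling, dt)
    return phases


# Hypothetical ensemble: three players whose natural tempi differ slightly.
initial_phases = [0.0, 1.0, 2.5]                                  # radians
natural_freqs = [2 * math.pi * f for f in (1.95, 2.00, 2.05)]     # rad/s

r_coupled = order_parameter(simulate(initial_phases, natural_freqs, coupling=5.0))
r_uncoupled = order_parameter(simulate(initial_phases, natural_freqs, coupling=0.0))
print(f"coupled r = {r_coupled:.3f}, uncoupled r = {r_uncoupled:.3f}")
```

The order parameter r gives a single number for how tightly the ensemble is locked, which is one plausible quantity a learned predictive model could monitor or anticipate; with the coupling set to zero the players simply drift at their own tempi.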
This paper introduces a new method for detecting the onsets, offsets, and transitions of notes in real-time solo singing performances. It identifies onsets and offsets by locating the transitions from one note to the next, based on trajectory changes in the fundamental frequency. The accuracy of the approach is compared with that of eight well-known algorithms. It was tested with two datasets that contained 130 files of singing. The total duration of the datasets was more than seven hours, and they had more than 41,000 onset annotations. The analysis metrics used include the Average, the F-Measure Score, and ANOVA. The proposed algorithm was observed to determine onsets and offsets more accurately than the other algorithms. Additionally, unlike the other algorithms, it can detect the transitions between notes.
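The general idea of reading note boundaries off a fundamental-frequency trajectory can be sketched as follows. This is not the paper's algorithm, whose details are not given here; it is a deliberately simple stand-in that assumes a per-frame F0 track in Hz (0 for unvoiced frames) and uses a hypothetical threshold, `jump_semitones`, on the deviation from the running mean pitch of the current note.

```python
import math


def hz_to_semitones(f0_hz):
    """Convert Hz to MIDI note numbers (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(f0_hz / 440.0)


def segment_notes(f0_track, jump_semitones=0.8):
    """Return (onset_frame, offset_frame) pairs from a per-frame F0 track.

    A frame value of 0 marks an unvoiced frame. A note ends when voicing
    stops or when the pitch moves more than `jump_semitones` away from the
    running mean of the current note; in the latter case the same frame
    also starts the next note, i.e. a note-to-note transition.
    Offset frames are exclusive.
    """
    notes = []
    start = None      # first frame of the note being tracked
    total = 0.0       # sum of semitone values in the current note
    count = 0         # number of frames in the current note
    for i, f0 in enumerate(f0_track):
        if f0 <= 0:                       # unvoiced: close any open note
            if start is not None:
                notes.append((start, i))
                start = None
            continue
        pitch = hz_to_semitones(f0)
        if start is not None and abs(pitch - total / count) > jump_semitones:
            notes.append((start, i))      # trajectory change: transition
            start = None
        if start is None:                 # open a new note at this frame
            start, total, count = i, 0.0, 0
        total += pitch
        count += 1
    if start is not None:                 # close a note still open at the end
        notes.append((start, len(f0_track)))
    return notes


# Hypothetical track: silence, A4, a direct transition to B4, silence, C5.
track = [0, 0, 440, 441, 439, 494, 495, 493, 0, 0, 523, 524]
notes = segment_notes(track)
print(notes)
```

A note whose offset frame equals the next note's onset frame marks a direct transition without intervening silence. A real-time system would also need pitch tracking, smoothing, and handling of vibrato and glides, none of which this sketch attempts.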