This article describes a user-driven adaptive method for controlling the sonic response of digital musical instruments with information extracted from the timbre of the human voice. The mapping between heterogeneous attributes of the input and output timbres is determined from data collected via machine listening techniques and then processed by unsupervised machine learning algorithms. This approach is based on a minimum-loss mapping that hides any synthesizer-specific parameters and maps the vocal interaction directly to perceptual characteristics of the generated sound. The mapping adapts to the dynamics detected in the voice and maximizes the timbral space covered by the sound synthesizer. The strategies for mapping vocal control to perceptual timbral features, and for automating the customization of vocal interfaces for different users and synthesizers in general, are evaluated through a variety of qualitative and quantitative methods.
Recent methodologies for audio classification frequently involve cepstral and spectral features, applied to single-channel recordings of acoustic scenes and events. Further, transfer learning has been widely used over the years and has proven to be an efficient alternative to training neural networks from scratch. The lower time and resource requirements of pre-trained models allow for more versatility in developing system classification approaches. However, information on classification performance when using different features for multi-channel recordings is often limited. Furthermore, pre-trained networks are initially trained on larger databases and are often unnecessarily large. This poses a challenge when developing systems for devices with limited computational resources, such as mobile or embedded devices. This paper presents a detailed study of the most apparent and widely used cepstral and spectral features for multi-channel audio applications. Accordingly, we propose the use of spectro-temporal features. Additionally, the paper details the development of a compact version of the AlexNet model for computationally limited platforms by studying performance under various architectural and parameter modifications of the original network. The aim is to minimize the network size while maintaining the series network architecture and preserving the classification accuracy. Considering that other state-of-the-art compact networks present complex directed acyclic graphs, a series architecture offers an advantage in customizability. Experimentation was carried out in Matlab, using a database that we generated for this task, which is composed of four-channel synthetic recordings of both sound events and scenes. The top-performing methodology resulted in a weighted F1-score of 87.92% for scalogram features classified via the modified AlexNet-33 network, which has a size of 14.33 MB. The AlexNet network returned 86.24% at a size of 222.71 MB.
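For readers unfamiliar with the metric, the weighted F1-score reported above is commonly defined as the mean of the per-class F1 scores, weighted by the number of true samples in each class. The following is a minimal Python sketch of that common definition, not the authors' Matlab evaluation code; the function name `weighted_f1` and the toy labels are assumptions made for illustration. The last lines compute the size ratio between the two networks from the figures quoted in the abstract.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (hypothetical helper)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: predicted as `label` and actually `label`.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its share of the true labels.
        score += (n_true / total) * f1
    return score


# Toy labels, invented for illustration only.
y_true = ["scene", "scene", "scene", "event"]
y_pred = ["scene", "scene", "event", "event"]
toy_score = weighted_f1(y_true, y_pred)

# Size ratio of the original AlexNet to the compact network,
# using the sizes quoted in the abstract (222.71 MB and 14.33 MB).
size_ratio = 222.71 / 14.33
```

Weighting by class support keeps rare classes from dominating the score, which may matter when a database mixes sound scenes and sound events in unequal proportions.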
Higher education is facing disruptive changes in many fields. Students want to have the option of learning anywhere, anytime and in any format. Universities need to develop and deliver to future students a complete learning ecosystem. At the same time, universities are facing challenges such as growing costs and the pressure to give students the knowledge, competence, skills and ability to continuously adapt to future job environments. As a consequence, many universities are investigating new ways of collaborating and sharing resources to cater to the demands of students, industry and society. An example of this collaboration is a new joint master's programme between the two largest universities in Norway: the University of Oslo (UiO) and the Norwegian University of Science and Technology (NTNU). In this paper, we present the lessons learned from almost two years of teaching and learning in the new joint master's programme, "Music, Communication and Technology" (MCT), between NTNU and UiO. This programme is run in a two-campus learning space built as a two-way, audio-visual, high-quality, low-latency communication channel between the two campuses, called "The Portal". Moreover, MCT is the subject of research for the SALTO (Student Active Learning in a Two campus Organisation) project, where novel techniques in teaching and learning are explored, such as team-based learning (TBL), flipped classroom, and other forms of student active learning. The educational elements in this master's programme provide students with 21st-century skills and deliver knowledge within the humanities, entrepreneurship and technology. We elaborate on the technical, pedagogical and learning space-related challenges of delivering teaching and learning in these cross-university settings. The paper concludes with a set of strategies that can be used to improve student active learning in different scenarios.