Environmental Sound Recognition has become a relevant application for smart cities. Such an application, however, requires trained machine learning classifiers to categorize a limited set of audio classes. Although classical machine learning solutions have been proposed in the past, most recent solutions for automated and accurate sound classification are based on deep learning. Deep learning models tend to be large, which is problematic given that sound classifiers often have to be embedded in resource-constrained devices. In this paper, a classical machine learning classifier called MosAIc and a lighter Convolutional Neural Network model for environmental sound recognition are proposed to compete directly in accuracy with the latest deep learning solutions. Both approaches are evaluated on an embedded system to identify the key parameters when deploying such applications on constrained devices. The experimental results show that classical machine learning classifiers can be combined to achieve results similar to those of deep learning models, and even to outperform them in accuracy, at the cost of a longer classification time.
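The abstract does not specify the CNN topology; as a rough illustration of what a "lighter" embedded sound classifier can look like, the following minimal Keras sketch builds a small CNN over log-mel spectrogram inputs. The input shape, layer sizes, and ten-class output are assumptions for the sketch, not the paper's actual model.

```python
import tensorflow as tf

NUM_CLASSES = 10            # assumption: e.g. an UrbanSound8K-style label set
INPUT_SHAPE = (64, 128, 1)  # assumption: 64 mel bands x 128 frames, one channel

def build_small_cnn():
    """A deliberately small CNN for embedded environmental sound classification."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=INPUT_SHAPE),
        tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.GlobalAveragePooling2D(),  # keeps the parameter count low
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

model = build_small_cnn()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()  # tens of thousands of parameters, small enough for constrained devices
```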
Acoustic cameras allow the visualization of sound sources using microphone arrays and beamforming techniques. The required computational power increases with the number of microphones in the array and with the resolution of the acoustic images, particularly when targeting real-time operation. This constraint limits the use of acoustic cameras in many wireless sensor network applications (surveillance, industrial monitoring, etc.). In this paper, we propose a multi-mode System-on-Chip (SoC) Field-Programmable Gate Array (FPGA) architecture capable of satisfying this high computational demand while providing wireless communication for remote control and monitoring. The architecture produces real-time acoustic images at a resolution of 240 × 180, scalable to 640 × 480, by exploiting the multithreading capabilities of the hard-core processor. Furthermore, the timing cost of the different operational modes and resolutions is investigated in order to keep the system real-time under Wireless Sensor Network constraints.
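As context for the computational load, below is a minimal NumPy sketch of the narrowband delay-and-sum beamforming that underlies such acoustic cameras; the per-pixel steering loop makes clear why the cost grows with both the array size and the image resolution. The array geometry, steering frequency, and far-field assumption are illustrative choices for the sketch, not the parameters of the SoC/FPGA design.

```python
import numpy as np

C = 343.0   # speed of sound, m/s
F = 4000.0  # assumed narrowband steering frequency, Hz
mics = np.array([[x, y, 0.0]
                 for x in np.linspace(-0.1, 0.1, 8)
                 for y in np.linspace(-0.1, 0.1, 8)])  # assumed 8x8 planar array

def steering_vector(az, el):
    """Phase weights that align a far-field plane wave from direction (az, el)."""
    d = np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])
    delays = mics @ d / C                    # per-microphone delay, seconds
    return np.exp(-2j * np.pi * F * delays)  # conjugate-align the phases

def acoustic_image(snapshots, width=240, height=180):
    """Power map over an az/el grid; snapshots is (num_mics, num_samples) complex."""
    azs = np.linspace(-np.pi / 2, np.pi / 2, width)
    els = np.linspace(-np.pi / 4, np.pi / 4, height)
    img = np.empty((height, width))
    for i, el in enumerate(els):
        for j, az in enumerate(azs):
            w = steering_vector(az, el)
            img[i, j] = np.mean(np.abs(w.conj() @ snapshots) ** 2)  # DAS power
    return img
```

The doubly nested pixel loop, multiplied by the number of microphones, is exactly the work that the FPGA fabric and the multithreaded hard-core processor parallelize in the architecture described above.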
In recent years, Environmental Sound Recognition (ESR) has become a relevant capability for urban monitoring applications. The techniques for automated sound recognition often rely on machine learning approaches, which have increased in complexity to achieve higher accuracy. Nonetheless, such machine learning techniques often have to be deployed on resource- and power-constrained embedded devices, which has become a challenge with the adoption of deep learning approaches based on Convolutional Neural Networks (CNNs). Field-Programmable Gate Arrays (FPGAs) are power efficient and highly suitable for computationally intensive algorithms like CNNs; by fully exploiting their parallel nature, they have the potential to accelerate inference compared to other embedded devices. Similarly, dedicated architectures for accelerating Artificial Intelligence (AI), such as Tensor Processing Units (TPUs), promise to deliver high accuracy while achieving high performance. In this work, we evaluate existing tool flows for deploying CNN models on FPGAs as well as on TPU platforms. We propose and adjust several CNN-based sound classifiers to be embedded on such hardware accelerators. The results demonstrate the maturity of the existing tools and how FPGAs can be exploited to outperform TPUs.
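To make the TPU side of such a deployment path concrete, here is a minimal sketch of full-integer post-training quantization with the TensorFlow Lite converter, the usual prerequisite before the offline Edge TPU compilation step. The stand-in model, input shape, and random calibration data are illustrative assumptions; the FPGA tool flows evaluated in the paper follow different routes.

```python
import numpy as np
import tensorflow as tf

# Stand-in for a trained CNN sound classifier (illustrative, untrained).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 128, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

def representative_dataset():
    # Calibration samples for quantization; real spectrograms would be used here.
    for _ in range(100):
        yield [np.random.rand(1, 64, 128, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("classifier_int8.tflite", "wb") as f:
    f.write(converter.convert())
# The resulting .tflite model is then compiled offline (edgetpu_compiler)
# before it can run on the TPU.
```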
Microphone arrays are gaining in popularity thanks to the availability of low-cost microphones. Applications including sonar, binaural hearing aids, acoustic indoor localization, and speech recognition have been proposed by several research groups and companies. Most available implementations assume that the microphones offer an ideal response in a given frequency range. Several toolboxes and software packages can compute the theoretical response of a microphone array with a given beamforming algorithm; however, no tool could be found that facilitates the design of a microphone array while taking the non-ideal characteristics of the microphones into account. Moreover, to our knowledge, no tool generates packages that facilitate implementation on Field-Programmable Gate Arrays. Visualizing the responses in 2D and 3D also poses an engineering challenge. To alleviate these shortcomings, a scalable Cloud-based Acoustic Beamforming Emulator (CABE) is proposed. The emulator considers the non-ideal characteristics of the microphones during its computations, and its results are validated with acoustic data captured from real microphones. It can also generate hardware description language packages containing delay tables that facilitate the implementation of Delay-and-Sum beamformers in embedded hardware, carry out truncation error analysis for fixed-point signal processing, and calculate the effects of disabling a given group of microphones within the array. Results and packages can be visualized with a dedicated client application, in which users create and configure emulations, including the sound source placement, the shape of the microphone array, and the required signal processing flow. Depending on the user configuration, the client application generates 2D and 3D graphs of the beamforming results, waterfall diagrams, and performance metrics. The emulations are also validated with captured data from existing microphone arrays.
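The delay tables mentioned above map each steering direction to an integer sample delay per microphone. The following NumPy sketch illustrates that generation step together with the rounding (truncation) error analysis it enables; the array geometry, sample rate, and steering grid are assumptions, not CABE's actual parameters or output format.

```python
import numpy as np

FS = 48_000   # assumption: 48 kHz sampling rate
C = 343.0     # speed of sound in air, m/s
ring = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
mics = 0.05 * np.column_stack([np.cos(ring), np.sin(ring)])  # 12-mic ring, r = 5 cm

def delay_table(num_steer=64):
    """Integer sample delays (num_steer x num_mics) for far-field Delay-and-Sum."""
    steer = np.linspace(0.0, 2.0 * np.pi, num_steer, endpoint=False)
    dirs = np.column_stack([np.cos(steer), np.sin(steer)])  # unit steering vectors
    tau = dirs @ mics.T / C                  # continuous delays, seconds
    tau -= tau.min(axis=1, keepdims=True)    # shift so every delay is non-negative
    samples = np.rint(tau * FS).astype(np.int32)
    return samples, samples - tau * FS       # table and per-entry rounding error

table, err = delay_table()
print(table.shape, "max |rounding error|:", np.abs(err).max(), "samples")
```

In an HDL package, a table like this would be stored in read-only memory so the beamformer can index delays by steering direction instead of computing them at runtime.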
Until recently, data acquisition in integrated pest management (IPM) relied on manual collection of both pest and environmental data. Autonomous wireless sensor networks (WSNs) provide a way forward by reducing the need for manual offload and maintenance; however, there is still a significant gap in pest management using WSNs, with most applications failing to provide a low-cost, autonomous monitoring system that can operate in remote areas. In this study, we investigate the feasibility of implementing a reliable, fully independent, low-power WSN that provides high-resolution, near-real-time input to a spatial decision support system (SDSS), capturing the small-scale heterogeneity needed for intelligent IPM. The WSN hosts a dual uplink that takes advantage of both satellite and terrestrial communication. A set of tests was conducted to assess metrics such as the signal strength, data transmission, and bandwidth of the SatCom module, as well as the mesh configuration, energetic autonomy, point-to-point communication, and data loss of the WSN nodes. Finally, we demonstrate the SDSS output of two vector models forced by WSN data from a field site in Belgium. We believe that this system can be a cost-effective solution for intelligent IPM in remote areas where there is no reliable terrestrial connection.
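A dual uplink of this kind typically prefers the cheaper terrestrial link and falls back to satellite. The sketch below illustrates one plausible failover policy; the Uplink class and its methods are hypothetical stand-ins for illustration, not the node firmware's actual API.

```python
import time

class Uplink:
    """Hypothetical uplink interface (terrestrial mesh or SatCom module)."""
    def __init__(self, name: str, cost: float):
        self.name, self.cost = name, cost
    def is_up(self) -> bool:
        raise NotImplementedError  # e.g. check signal strength / link state
    def send(self, payload: bytes) -> bool:
        raise NotImplementedError  # transmit one buffered batch of readings

def transmit(readings: bytes, links: list, retries: int = 3,
             backoff_s: float = 5.0):
    """Try uplinks in order of cost; report None if every link fails."""
    for link in sorted(links, key=lambda l: l.cost):
        for _ in range(retries):
            if link.is_up() and link.send(readings):
                return link.name
            time.sleep(backoff_s)  # simple fixed backoff between attempts
    return None  # caller should queue readings for the next transmission window
```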