Abstract: Voice-activated artificial intelligence (AI) technology has advanced rapidly and is being adopted in various devices such as smart speakers and display products, which enable users to multitask without touching the devices. However, most devices equipped with cameras and displays lack mobility; therefore, users cannot avoid touching them for face-to-face interactions, which contradicts the voice-activated AI philosophy. In this paper, we propose a deep neural network-based real-time sound source localization (…
“…The authors investigated voice-activated AI technology and proposed a deep-neural-network-based real-time sound source localization (SSL) model for low-power IoT devices. The authors used multichannel acoustic data to parallelize convolutional neural network layers in the form of multiple streams in order to capture unique delay patterns in the low-, mid-, and high-frequency ranges and estimate the fine and coarse location of voices [58].…”
The widespread use of the internet and the exponential growth in small hardware diversity enable the development of Internet of things (IoT)-based localization systems. We review machine-learning-based approaches for IoT localization systems in this paper. Because of their high prediction accuracy, machine learning methods are now being used to solve localization problems. The paper’s main goal is to provide a review of how learning algorithms are used to solve IoT localization problems, as well as to address current challenges. We examine the existing literature for published papers released between 2020 and 2022. These studies are classified according to several criteria, including their learning algorithm, chosen environment, specific covered IoT protocol, and measurement technique. We also discuss the potential applications of learning algorithms in IoT localization, as well as future trends.
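The multi-stream idea in the citation statement above (parallel convolutional streams over multichannel audio, followed by coarse and fine location outputs) can be sketched as a forward pass. This is only an illustrative sketch, not the cited authors' model: the weights are random and untrained, and the names `STREAMS`, `N_MICS`, `N_FILTERS`, `N_SECTORS`, `coarse_head`, and `fine_head` are hypothetical. Using a different dilation per stream as a stand-in for the low-, mid-, and high-frequency ranges is also an assumption.

```python
import random


def conv1d_relu(x, kernels, dilation):
    """Dilated 1D convolution + ReLU. x: [channel][sample], kernels: [out][in][tap]."""
    taps = len(kernels[0][0])
    length = len(x[0]) - (taps - 1) * dilation
    out = []
    for kernel in kernels:
        row = []
        for t in range(length):
            acc = 0.0
            for c, channel in enumerate(x):
                for k in range(taps):
                    acc += kernel[c][k] * channel[t + k * dilation]
            row.append(max(acc, 0.0))
        out.append(row)
    return out


def global_max_pool(feature_maps):
    """Keep the strongest response of each filter."""
    return [max(row) for row in feature_maps]


def linear(vec, weights):
    """Dense layer without bias."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def rand_matrix(rng, *shape):
    """Nested list of uniform random values (placeholder for trained weights)."""
    if len(shape) == 1:
        return [rng.uniform(-0.5, 0.5) for _ in range(shape[0])]
    return [rand_matrix(rng, *shape[1:]) for _ in range(shape[0])]


rng = random.Random(0)
N_MICS, N_FILTERS, N_SECTORS = 4, 8, 6

# Assumption: larger dilation covers longer delays, standing in for lower bands.
STREAMS = {"low": 8, "mid": 4, "high": 1}
weights = {name: rand_matrix(rng, N_FILTERS, N_MICS, 5) for name in STREAMS}
coarse_head = rand_matrix(rng, N_SECTORS, N_FILTERS * len(STREAMS))
fine_head = rand_matrix(rng, 1, N_FILTERS * len(STREAMS))


def forward(frame):
    """Run all streams on the same multichannel frame and merge their features."""
    features = []
    for name, dilation in STREAMS.items():
        features += global_max_pool(conv1d_relu(frame, weights[name], dilation))
    coarse_logits = linear(features, coarse_head)  # one score per angular sector
    fine_offset = linear(features, fine_head)[0]   # scalar refinement in a sector
    return coarse_logits, fine_offset


frame = rand_matrix(rng, N_MICS, 128)  # stand-in for one multichannel audio frame
coarse_logits, fine_offset = forward(frame)
sector = max(range(N_SECTORS), key=lambda i: coarse_logits[i])
```

Splitting the output into a sector classifier and a scalar refinement is one plausible reading of "fine and coarse location"; the cited paper may define the two outputs differently.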
“…As an alternative to signal processing based DoA estimation methods, end-to-end ML models eliminate feature extraction step and enable low-computation DoA prediction. However, the best offerings of the current literature focus on platforms with significantly higher power consumption than battery-powered nodes used in low-cost IoT-based noise monitoring systems [45], [46].…”
Static noise maps depicting long-term noise levels over wide areas are valuable urban planning assets for municipalities seeking to decrease the noise exposure of residents. However, non-traffic noise sources with transient behavior, about which people frequently complain, are usually ignored by static maps. We propose here a dynamic noise mapping approach using the data collected via low-power wide-area network (LPWAN, specifically LoRaWAN) based internet of things (IoT) infrastructure, which is one of the most common communication backbones for smart cities. Noise mapping based on LPWAN is challenging due to the low data rates of these protocols. The proposed dynamic noise mapping approach diminishes the negative implications of data rate limitations using machine learning (ML) for event and location prediction of non-traffic sources based on the scarce data. The strength of these models lies in their consideration of the spatial variance in acoustic behavior caused by the buildings in urban settings. The effectiveness of the proposed method and the accuracy of the resulting dynamic maps are evaluated in field tests. The results show that the proposed system can decrease the map error caused by non-traffic sources by up to 51% and can stay effective under significant packet losses.
“…The flying drone localizes the sound source from ambient noise and frequent movement [19]. The low-power device with a microphone array derives the AoA for the sound source in the application of various human interaction [20]. The complex environment from the indoor condition is challenged for the SSL system by Machhamer [21] and Zhang [22].…”
To extract the phase information from multiple receivers, the conventional sound source localization system involves substantial complexity in software and hardware. Along with the algorithm complexity, the dedicated communication channel and individual analog-to-digital conversions prevent an increase in the system’s capability due to feasibility. The previous study suggested and verified the single-channel sound source localization system, which aggregates the receivers on the single analog network for the single digital converter. This paper proposes the improved algorithm for the single-channel sound source localization system based on the Gaussian process regression with the novel feature extraction method. The proposed system consists of three computational stages: homomorphic deconvolution, feature extraction, and Gaussian process regression in cascade. The individual stages represent time delay extraction, data arrangement, and machine prediction, respectively. The optimal receiver configuration for the three-receiver structure is derived from the novel similarity matrix analysis based on the time delay pattern diversity. The simulations and experiments present precise predictions with proper model order and ensemble average length. The nonparametric method, with the rational quadratic kernel, shows consistent performance on trained angles. The Steiglitz–McBride model with the exponential kernel delivers the best predictions for trained and untrained angles with low bias and low variance in statistics.
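The first stage named in the abstract above, homomorphic deconvolution for time delay extraction, can be illustrated with a real-cepstrum sketch. It assumes a simplified single-channel mixture of one source and one circularly delayed, attenuated copy; the names `TRUE_DELAY`, `GAIN`, and `estimate_delay` are hypothetical, and the small radix-2 FFT is included only to keep the example dependency-free. It is not the paper's algorithm, which continues with feature extraction and Gaussian process regression.

```python
import cmath
import math
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def real_cepstrum(signal):
    """Inverse transform of the log-magnitude spectrum."""
    n = len(signal)
    log_mag = [math.log(abs(v) + 1e-12) for v in fft(signal)]
    # The log-magnitude of a real signal is real and even, so a forward FFT
    # scaled by 1/n gives the same result as the inverse FFT.
    return [v.real / n for v in fft(log_mag)]


def estimate_delay(signal, min_lag=1):
    """Return the quefrency (in samples) of the largest cepstral peak."""
    ceps = real_cepstrum(signal)
    return max(range(min_lag, len(signal) // 2), key=lambda q: ceps[q])


random.seed(0)
N = 1024          # frame length (power of two)
TRUE_DELAY = 37   # hypothetical inter-receiver delay in samples
GAIN = 0.8        # hypothetical attenuation of the delayed path

source = [random.gauss(0.0, 1.0) for _ in range(N)]
# Single channel carrying the direct signal plus a circularly delayed copy.
mixture = [source[i] + GAIN * source[(i - TRUE_DELAY) % N] for i in range(N)]

estimated = estimate_delay(mixture)
print(estimated)
```

The delayed copy multiplies the spectrum by a periodic ripple. Taking the logarithm turns that product into a sum, so the ripple appears as an isolated cepstral peak at the delay, separate from the broadband source.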