Embedding sound localization and spatial audio interaction through coincident microphones arrays

Vryzas, Nikolaos; Dimoulas, Charalampos; Papanikolaou, George

doi:10.1145/2814895.2814917

Cited by 6 publications

(1 citation statement)

References 4 publications

“…Circular microphone arrays are quite common for recording multichannel audio, as they encourage the exploitation of spatial information contained in complex acoustic environments [60]. Techniques, such as independent component analysis, adaptive filtering, and beamforming, have long demonstrated the power of spatially-aware systems in localizing sound sources [61,62] and detecting audio events [63]. However, the majority of spatial filtering techniques require knowledge of the recording setup and are usually based on statistical assumptions that are not always met in real-world conditions, especially when multiple sound sources are present.…”

Section: Data Augmentation For Spatial Invariance (mentioning)

confidence: 99%

Semi-Supervised Machine Condition Monitoring by Learning Deep Discriminative Audio Features

2021

In this study, we aim to learn highly descriptive representations for a wide set of machinery sounds and exploit this knowledge to perform condition monitoring of mechanical equipment. We propose a comprehensive feature learning approach that operates on raw audio, by supervising the formation of salient audio embeddings in latent states of a deep temporal convolutional neural network. By fusing the supervised feature learning approach with an unsupervised deep one-class neural network, we are able to model the characteristics of each source and implicitly detect anomalies in different operational states of industrial machines. Moreover, we enable the exploitation of spatial audio information in the learning process, by formulating a novel front-end processing strategy for circular microphone arrays. Experimental results on the MIMII dataset demonstrate the effectiveness of the proposed method, reaching a state-of-the-art mean AUC score of 91.0%. Anomaly detection performance is significantly improved by incorporating multi-channel audio data in the feature extraction process, as well as training the convolutional neural network on the spatially invariant front-end. Finally, the proposed semi-supervised approach allows the concise modeling of normal machine conditions and accurately detects system anomalies, compared to existing anomaly detection methods.
