The objective of a sound event detector is to recognize anomalous events in an audio clip and return their onset and offset times. Detecting sound events in noisy environments is challenging: in a real audio signal several sound sources coexist, the characteristics of polyphonic audio differ from those of isolated recordings, and noise (e.g. thermal and environmental) is always present. In this contribution, we present a sound anomaly detection system based on a fully convolutional network that exploits image spatial filtering and an Atrous Spatial Pyramid Pooling module. To cope with the lack of datasets specifically designed for sound event detection, a dataset for the specific application of noisy bus environments has been designed. It was obtained by mixing background audio recorded in a real environment with anomalous events extracted from monophonic collections of labelled audio clips. The performance of the proposed system has been evaluated through segment-based metrics such as error rate, recall, and F1-score. Moreover, robustness and precision have been evaluated through four different tests. The analysis of the results shows that the proposed sound event detector outperforms both state-of-the-art methods and general-purpose deep learning solutions.
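The core operation behind the Atrous Spatial Pyramid Pooling module mentioned above is the dilated (atrous) convolution: the same kernel is applied at several dilation rates in parallel so that features are gathered at multiple receptive-field sizes without extra parameters. The following is a minimal numpy sketch of that idea, not the authors' implementation; the function names `dilated_conv2d` and `aspp` and the single-channel, valid-padding setting are illustrative assumptions.

```python
import numpy as np


def dilated_conv2d(x, kernel, rate):
    """Valid 2-D convolution of a single-channel map with an atrous kernel.

    A dilation rate r is equivalent to inserting (r - 1) zeros between
    kernel taps, enlarging the receptive field without adding weights.
    (Illustrative sketch, not the paper's implementation.)
    """
    kh, kw = kernel.shape
    eh = (kh - 1) * rate + 1  # effective kernel height
    ew = (kw - 1) * rate + 1  # effective kernel width
    H, W = x.shape
    out = np.zeros((H - eh + 1, W - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # sample the input with stride = rate inside the window
            patch = x[i:i + eh:rate, j:j + ew:rate]
            out[i, j] = np.sum(patch * kernel)
    return out


def aspp(x, kernels, rates):
    """Run parallel atrous branches and stack their (cropped) outputs."""
    outs = [dilated_conv2d(x, k, r) for k, r in zip(kernels, rates)]
    # larger rates shrink the valid output, so crop all branches alike
    h = min(o.shape[0] for o in outs)
    w = min(o.shape[1] for o in outs)
    return np.stack([o[:h, :w] for o in outs])
```

In a real network the branches would be learned convolutions over log-mel spectrogram feature maps and the stacked outputs fused by a 1x1 convolution; here the point is only how the dilation rate widens the context each branch sees.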
In the near future, the broadcasting scenario will be characterized by immersive content. One of the systems for capturing the 3D content of a scene is Light Field imaging. The huge amount of data and the specific transmission scenario impose strong constraints on services and applications. Among other constraints, the evaluation of the quality of the received media cannot rely on the original signal but must be based only on the received data. In this direction, we propose a no-reference quality metric for light field images based on spatial and angular characteristics. In more detail, the estimated saliency and cyclopean maps of light field images are exploited to extract the spatial features. The angular-consistency features are instead measured using the Global Luminance Distribution knowledge and the Weighted Local Binary Patterns operator on Epipolar Plane Images. The effectiveness of the proposed metric is assessed by comparing its performance with state-of-the-art quality metrics on four datasets: SMART, Win5-LID, VALID 10-bit, and VALID 8-bit. Furthermore, the performance is analyzed in cross-dataset settings, with different distortions, and for different saliency maps. The results show that the proposed model outperforms state-of-the-art approaches and performs well for different distortion types and with various saliency models.
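The angular features above rely on Local Binary Patterns computed over epipolar plane images: each pixel is encoded by thresholding its 8 neighbours against it, and the histogram of the resulting codes summarizes the EPI texture (angular distortions disturb the straight-line structure of EPIs and hence this histogram). A minimal numpy sketch of the basic, unweighted LBP stage follows; the abstract's Weighted LBP variant, the function names, and the 8-neighbour/256-bin configuration are assumptions for illustration.

```python
import numpy as np


def lbp_codes(img):
    """Basic 8-neighbour LBP code for every interior pixel.

    Each neighbour contributes one bit: 1 if it is >= the centre pixel.
    (Illustrative sketch of plain LBP, not the paper's weighted operator.)
    """
    c = img[1:-1, 1:-1]  # centre pixels (border excluded)
    # neighbours in clockwise order starting at the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=np.uint8)
    H, W = img.shape
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:H - 1 + dy, 1 + dx:W - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code


def epi_lbp_histogram(epi):
    """Normalised 256-bin histogram of LBP codes for one EPI."""
    codes = lbp_codes(epi)
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / hist.sum()
```

In the full metric, one EPI is sliced from the light field per fixed spatial row (or column) and angular index, its codes are pooled into such histograms, and those histograms feed the angular-consistency features.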