AI perspectives in Smart Cities and Communities to enable road vehicle automation and smart traffic control

Englund, Cristofer; Aksoy, Eren Erdal; Alonso‐Fernandez, Fernando; Cooney, Martin; Pashami, Sepideh; Åstrand, Björn

doi:10.48550/arxiv.2104.03150

Cited by 1 publication

(1 citation statement)

References 68 publications

(78 reference statements)


“…Machine learning methods in computer vision and image processing problems [37] have been applied for a good deal of research applications (e.g., [38][39][40][41][42][43][44][45][46][47][48][49][50][51]). Deep learning is a subset of machine learning that utilizes huge volumes of data and sophisticated algorithms for training a model.…”

Section: Overview of the Generalized Architecture (rpNet)

mentioning

confidence: 99%

Deep Crowd Anomaly Detection by Fusing Reconstruction and Prediction Networks

2023


Abnormal event detection is one of the most challenging tasks in computer vision. Many existing deep anomaly detection models are based on reconstruction errors: training uses only videos of normal events, and the trained model then estimates frame-level scores for an unknown input. It is assumed that, during the testing phase, the reconstruction error gap between normal and abnormal frames is large. Yet this assumption may not always hold, owing to the superior capacity and generalization of deep neural networks. In this paper, we design a generalized framework (rpNet) for proposing a series of deep models by fusing several options of a reconstruction network (rNet) and a prediction network (pNet) to detect anomalies in videos efficiently. In the rNet, either a convolutional autoencoder (ConvAE) or a skip-connected ConvAE (AEc) can be used, whereas in the pNet, either a traditional U-Net, a non-local block U-Net, or an attention block U-Net (aUnet) can be applied. The fusion of rNet and pNet increases the error gap. Our deep models have distinct degrees of feature extraction capability. One of our models (AEcaUnet), which consists of an AEc with our proposed aUnet, can ensure a better error gap and extract the high-quality features needed for video anomaly detection. Experimental results on the UCSD-Ped1, UCSD-Ped2, CUHK-Avenue, ShanghaiTech-Campus, and UMN datasets, with rigorous statistical analysis, show the effectiveness of our models.
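As a rough illustration of the idea of scoring frames from both a reconstruction error and a prediction error, the sketch below computes a per-frame score from the outputs of two models. It is not the rpNet method: the abstract does not specify how the errors are combined, so the weighted sum, the min-max normalization, and the names `frame_errors`, `fused_anomaly_scores`, and `weight` are all assumptions made for this example.

```python
import numpy as np

def frame_errors(frames, outputs):
    """Mean squared error per frame; both inputs have shape (T, H, W)."""
    diff = np.asarray(frames, dtype=float) - np.asarray(outputs, dtype=float)
    return (diff ** 2).mean(axis=(1, 2))

def fused_anomaly_scores(frames, reconstructed, predicted, weight=0.5):
    """Combine reconstruction and prediction errors into scores in [0, 1].

    The weighted sum and min-max normalization are illustrative
    assumptions, not the fusion used in the cited paper.
    """
    rec_err = frame_errors(frames, reconstructed)
    pred_err = frame_errors(frames, predicted)
    fused = weight * rec_err + (1.0 - weight) * pred_err
    span = fused.max() - fused.min()
    if span == 0:
        # All frames look equally normal; report zero everywhere.
        return np.zeros_like(fused)
    return (fused - fused.min()) / span

# Toy data: five 4x4 frames that both hypothetical models reproduce
# perfectly, except frame 3, which both get wrong.
rng = np.random.default_rng(0)
frames = rng.random((5, 4, 4))
reconstructed = frames.copy()
predicted = frames.copy()
reconstructed[3] += 0.5
predicted[3] += 0.5

scores = fused_anomaly_scores(frames, reconstructed, predicted)
```

In this toy case only frame 3 receives a nonzero score. With real models, a threshold on such scores would have to be chosen on validation data.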
