The study of crowd movement and behavioral patterns typically relies on spatio-temporal localization data of pedestrians. While monocular cameras can serve this purpose, industrial binocular cameras based on multi-view geometry offer higher spatial accuracy. Such cameras synchronize time through dedicated circuitry and are calibrated for external parameters once their relative positions are fixed. In contrast, assembling a short-baseline binocular camera from two different cameras or smartphones placed in close proximity offers flexibility and real-time adaptability, but raises challenges in camera time synchronization, external parameter calibration, and pedestrian feature matching. A method is introduced herein that addresses these challenges jointly. Images are abstracted into spatio-temporal point sets based on human head coordinates and frame numbers. Through point set registration, time synchronization and pedestrian matching are achieved concurrently, followed by calibration of the short-baseline camera's external parameters. Numerical results from synthetic and real-world scenarios demonstrate the proposed model's capability to address these fundamental challenges. Relying solely on crowd image data, without external hardware, software, or manual calibration, the method achieves sub-millisecond time synchronization, an average pedestrian matching accuracy of 92%, and external parameters whose precision matches that of calibration-board methods. Ultimately, this research enables self-calibration, automatic time synchronization, and pedestrian matching for short-baseline camera assemblies observing crowds.
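To make the registration idea concrete, the following minimal Python sketch illustrates how a time offset between two unsynchronized cameras might be estimated jointly with pedestrian correspondences by registering spatio-temporal point sets of head detections. This is not the paper's implementation: the detection format, the scale factor used to blend temporal and spatial distances, and the coarse-to-fine grid search are all illustrative assumptions.

```python
"""Illustrative sketch: joint time-offset estimation and pedestrian matching
via spatio-temporal point set registration (assumptions, not the paper's code).

Each detection is assumed to be a (t, x, y) triple: a timestamp in seconds and
the pixel coordinates of a detected head. The short baseline is assumed to make
corresponding heads appear at roughly similar image positions, so nearest-
neighbour matching in a combined (time, x, y) space is meaningful.
"""
import numpy as np
from scipy.spatial import cKDTree


def registration_cost(pts_a, pts_b, dt, time_scale=100.0):
    """Mean nearest-neighbour distance after shifting camera B's clock by dt.

    time_scale converts seconds into a pixel-comparable unit so temporal and
    spatial distances can live in one 3-D space (a tuning assumption).
    Returns the mean distance and, for each point of A, its nearest B index.
    """
    shifted = pts_b.copy()
    shifted[:, 0] += dt
    a = np.column_stack([pts_a[:, 0] * time_scale, pts_a[:, 1:]])
    b = np.column_stack([shifted[:, 0] * time_scale, shifted[:, 1:]])
    dists, idx = cKDTree(b).query(a)
    return dists.mean(), idx


def synchronise_and_match(pts_a, pts_b, search=(-2.0, 2.0),
                          coarse=0.05, fine=0.001):
    """Coarse-to-fine 1-D search over the unknown time offset dt (seconds)."""
    # Coarse pass over the whole search window.
    grid = np.arange(search[0], search[1], coarse)
    costs = [registration_cost(pts_a, pts_b, dt)[0] for dt in grid]
    best = grid[int(np.argmin(costs))]
    # Fine pass around the coarse optimum, at roughly millisecond resolution.
    grid = np.arange(best - coarse, best + coarse, fine)
    costs = [registration_cost(pts_a, pts_b, dt)[0] for dt in grid]
    best = grid[int(np.argmin(costs))]
    _, matches = registration_cost(pts_a, pts_b, best)
    return best, matches  # estimated offset and per-detection correspondences
```

In the setting described above, the correspondences recovered at the best offset would then feed a standard external parameter estimation step (for example, fundamental matrix estimation from the matched head coordinates), which the sketch omits.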