Anomaly event detection in crowd scenes is extremely important; however, the majority of existing studies merely use hand-crafted features to detect anomalies. In this study, a novel unsupervised deep learning framework is proposed to detect anomaly events in crowded scenes. Specifically, low-level visual features, energy features, and motion map features are simultaneously extracted based on spatiotemporal energy measurements. Three convolutional restricted Boltzmann machines are trained to model the mid-level feature representation of normal patterns. Then a multimodal fusion scheme is utilized to learn the deep representation of crowd patterns. Based on the learned deep representation, a one-class support vector machine model is used to detect anomaly events. The proposed method is evaluated using two available public datasets and compared with state-of-the-art methods. The experimental results show its competitive performance for anomaly event detection in video surveillance.
Measurement of image similarity is important for a number of image processing applications. Image similarity assessment is closely related to image quality assessment and is based on the apparent differences between a degraded image and the original, unmodified image. Automated evaluation of image retrieval systems relies on accurate measurement of similarity between the input image and the database images. In this paper, we treat the image at the pixel level and use the mean squared error (MSE) to measure similarity between the input image and the training images. Simulations on a set of examples show that MSE is accurate and computationally cheap, but it gives poor precision on image similarity and performs poorly on rotated, translated, or flipped images. We therefore propose using the minimum circumscribed circle (concentric circles) with local binary patterns, and we compare the results to identify the best image similarity method. The comparison shows some improvement in performance.
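The pixel-level MSE comparison described above, and its sensitivity to flipping, can be illustrated with a minimal sketch. It assumes grayscale images stored as nested lists of equal size; the helper names `mse` and `flip_horizontal` and the toy 3x3 image are hypothetical and not taken from the paper.

```python
def mse(image_a, image_b):
    """Mean squared error between two equally sized grayscale images.

    Images are assumed to be lists of rows of numeric pixel values.
    """
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0.0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += (pixel_a - pixel_b) ** 2
            count += 1
    return total / count


def flip_horizontal(image):
    """Return a left-right mirrored copy of the image."""
    return [row[::-1] for row in image]


# Toy image: a horizontal intensity ramp (illustrative data only).
query = [
    [0, 50, 100],
    [0, 50, 100],
    [0, 50, 100],
]

identical_score = mse(query, query)
flipped_score = mse(query, flip_horizontal(query))
print(identical_score, flipped_score)
```

An identical image scores zero, while the mirrored copy of the same content scores far above zero, which is the kind of failure on flipped images that the abstract reports for plain MSE.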
Video event detection is a challenging problem in many applications, such as video surveillance and video content analysis. In this paper, we propose a new framework that learns high-level codewords by analyzing the temporal relationships between different channels of video features. Low-level vocabulary words are first generated from the extracted audio and visual features. A weighted undirected graph is constructed by exploring the Granger causality between low-level words. Then, a greedy agglomerative graph-partitioning method is used to discover groups of low-level words that have similar temporal patterns. The high-level codebook representation is obtained by quantizing the low-level word groups. Finally, multiple kernel learning, combined with our high-level codewords, is used to detect the video event. Extensive experimental results show that the proposed method achieves favorable results in video event detection.
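A greedy agglomerative partitioning of a weighted undirected graph might look roughly like the sketch below. The abstract does not state the merge criterion or the stopping rule, so average inter-group edge weight and a fixed `threshold` are assumptions, as are the function names, the word names, and the edge weights (which stand in for Granger-causality scores that are not computed here).

```python
from itertools import combinations


def average_linkage(group_a, group_b, weights):
    """Mean edge weight between two groups; missing edges count as 0."""
    total = sum(
        weights.get(frozenset((u, v)), 0.0) for u in group_a for v in group_b
    )
    return total / (len(group_a) * len(group_b))


def greedy_partition(nodes, weights, threshold):
    """Merge the most strongly linked pair of groups until none reaches threshold."""
    groups = [frozenset([node]) for node in nodes]
    while len(groups) > 1:
        best_score, best_pair = max(
            (
                (average_linkage(a, b, weights), (a, b))
                for a, b in combinations(groups, 2)
            ),
            key=lambda item: item[0],
        )
        if best_score < threshold:
            break  # no remaining pair is linked strongly enough
        a, b = best_pair
        groups = [g for g in groups if g not in (a, b)] + [a | b]
    return groups


# Hypothetical low-level words and illustrative edge weights.
nodes = ["audio_1", "audio_2", "visual_1", "visual_2"]
weights = {
    frozenset(("audio_1", "visual_1")): 0.9,
    frozenset(("audio_2", "visual_2")): 0.8,
    frozenset(("audio_1", "audio_2")): 0.1,
    frozenset(("visual_1", "visual_2")): 0.2,
    frozenset(("audio_1", "visual_2")): 0.05,
    frozenset(("audio_2", "visual_1")): 0.1,
}

word_groups = greedy_partition(nodes, weights, threshold=0.5)
print([sorted(group) for group in word_groups])
```

With these toy weights, each audio word ends up grouped with the visual word it is most strongly linked to, and the two resulting groups stay separate because their average linkage falls below the threshold.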
Over the past decades, crowd management has attracted a great deal of attention in the area of video surveillance. Among the various tasks of video surveillance analysis, crowd motion analysis is the basis of numerous subsequent applications of surveillance video. In this paper, a novel social force graph with a streak flow attribute is proposed to capture the global spatiotemporal changes and the local motion of crowd video. Crowd motion analysis is then implemented based on the characteristics of the social force graph. First, the streak flow of the crowd sequence is extracted to represent the global crowd motion; after that, spatiotemporal analogous patches are obtained based on the crowd visual features. A weighted social force graph is then constructed based on multiple social properties of crowd video. The graph is segmented into particle groups to represent the similar motion patterns of crowd video. A codebook is then constructed by clustering all local particle groups, and abnormal crowd behaviors are consequently detected using the Latent Dirichlet Allocation model. Extensive experiments on challenging datasets show that the proposed method achieves favorable results in crowd motion segmentation and abnormal behavior detection.