Surveillance cameras are being installed in many primary daily living places to maintain public safety. In this video-surveillance context, anomalies occur only for a very short time, and very occasionally. Hence, manual monitoring of such anomalies may be exhaustive and monotonous, resulting in a decrease in reliability and speed in emergency situations due to monitor tiredness. Within this framework, the importance of automatic detection of anomalies is clear, and, therefore, an important amount of research works have been made lately in this topic. According to these earlier studies, supervised approaches perform better than unsupervised ones. However, supervised approaches demand manual annotation, making dependent the system reliability of the different situations used in the training (something difficult to set in anomaly context). In this work, it is proposed an approach for anomaly detection in video-surveillance scenes based on a weakly supervised learning algorithm. Spatio-temporal features are extracted from each surveillance video using a temporal convolutional 3D neural network (T-C3D). Then, a novel ranking loss function increases the distance between the classification scores of anomalous and normal videos, reducing the number of false negatives. The proposal has been evaluated and compared against state-of-art approaches, obtaining competitive performance without fine-tuning, which also validates its generalization capability. In this paper, the proposal design and reliability is presented and analyzed, as well as the aforementioned quantitative and qualitative evaluation in-the-wild scenarios, demonstrating its high sensitivity in anomaly detection in all of them.
This work describes an end-to-end approach for real-time human action recognition from raw depth image-sequences. The proposal is based on a 3D fully convolutional neural network, named 3DFCNN, which automatically encodes spatio-temporal patterns from raw depth sequences. The described 3D-CNN allows actions classification from the spatial and temporal encoded information of depth sequences. The use of depth data ensures that action recognition is carried out protecting people’s privacy, since their identities can not be recognized from these data. The proposed 3DFCNN has been optimized to reach a good performance in terms of accuracy while working in real-time. Then, it has been evaluated and compared with other state-of-the-art systems in three widely used public datasets with different characteristics, demonstrating that 3DFCNN outperforms all the non-DNN-based state-of-the-art methods with a maximum accuracy of 83.6% and obtains results that are comparable to the DNN-based approaches, while maintaining a much lower computational cost of 1.09 seconds, what significantly increases its applicability in real-world environments.
Human actions recognition is a fundamental task in artificial vision, that has earned a great importance in recent years due to its multiple applications in different areas. In this context, this paper describes an approach for real-time human action recognition from raw depth image-sequences, provided by an RGB-D camera. The proposal is based on a 3D fully convolutional neural network, named 3DFCNN, which automatically encodes spatio-temporal patterns from depth sequences without pre-processing. Furthermore, the described 3D-CNN allows actions classification from the spatial and temporal encoded information of depth sequences. The use of depth data ensures that action recognition is carried out protecting people's privacy, since their identities can not be recognized from these data. 3DFCNN has been evaluated and its results compared to those from other state-of-the-art methods within three widely used datasets, with different characteristics (resolution, sensor type, number of views, camera location, etc.). The obtained results allows validating the proposal, concluding that it outperforms several state-of-the-art approaches based on classical computer vision techniques. Furthermore, it achieves action recognition accuracy comparable to deep learning based state-of-the-art methods with a lower computational cost, which allows its use in real-time applications.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.