Video anomaly detection is challenging because abnormal events in real scenes are unbounded, rare, equivocal, and irregular. In recent years, transformers have demonstrated powerful modelling abilities for sequence data, so we apply them to video anomaly detection. In this paper, we propose a prediction-based video anomaly detection approach named TransAnomaly. Our model combines the U-Net and the Video Vision Transformer (ViViT) to capture richer temporal information and more global context. To make full use of the ViViT, we modify it so that it is capable of video prediction. Experiments on benchmark datasets show that adding the transformer module improves anomaly detection performance. In addition, we calculate regularity scores with sliding windows and evaluate the impact of different window sizes and strides. With proper settings, our model outperforms other state-of-the-art prediction-based video anomaly detection approaches. Furthermore, our model can perform anomaly localization by tracking the location of patches with lower regularity scores.
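The sliding-window regularity scoring described above can be illustrated with a minimal sketch. It is not the TransAnomaly implementation: it assumes grayscale frames stored as 2D NumPy arrays with values in [0, 1], uses a PSNR-style score per window (a common choice in prediction-based methods, assumed here), and all function names are hypothetical.

```python
import numpy as np


def window_errors(pred, actual, window, stride):
    """Mean squared prediction error for each sliding window (2D grid)."""
    h, w = actual.shape
    sq_err = (pred - actual) ** 2
    rows = range(0, h - window + 1, stride)
    cols = range(0, w - window + 1, stride)
    return np.array(
        [[sq_err[r:r + window, c:c + window].mean() for c in cols] for r in rows]
    )


def window_psnr(errors, max_val=1.0, eps=1e-8):
    """PSNR-style score per window; eps avoids division by zero (assumption)."""
    return 10.0 * np.log10(max_val ** 2 / (errors + eps))


def least_regular_window(pred, actual, window, stride):
    """Return the lowest window score and the top-left pixel of that window."""
    scores = window_psnr(window_errors(pred, actual, window, stride))
    r, c = np.unravel_index(np.argmin(scores), scores.shape)
    return float(scores[r, c]), (int(r) * stride, int(c) * stride)


def normalize_scores(frame_scores):
    """Min-max normalize per-frame scores to [0, 1] within one video."""
    s = np.asarray(frame_scores, dtype=float)
    span = s.max() - s.min()
    if span == 0:
        return np.ones_like(s)
    return (s - s.min()) / span


if __name__ == "__main__":
    actual = np.zeros((16, 16))
    pred = actual.copy()
    pred[8:12, 4:8] += 0.5  # simulate a poorly predicted (anomalous) patch
    score, location = least_regular_window(pred, actual, window=4, stride=4)
    print(score, location)
```

Taking the lowest-scoring window as the frame's score makes a small anomalous region stand out instead of being averaged away over the whole frame, and the window's position gives a coarse localization. Smaller windows and strides localize more finely at the cost of more computation and noisier scores.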
Accurately locating the weld seam under strong noise is the biggest challenge for automated welding. In this paper, we construct a robust seam detector within the framework of deep learning object detection. A representative object detection algorithm, the single shot multibox detector (SSD), is studied to establish the seam detector framework, and an improved SSD is applied to seam detection. Under the SSD framework, and taking into account the characteristics of the seam detection task, we propose the multifeature combination network (MFCN). The network combines the local and global information carried by multilayer features to detect the weld seam rapidly and accurately. To address the failure of single-frame seam detection under continuous super-strong noise, we propose the sequence image multifeature combination network (SMFCN), built on the MFCN detector. A recurrent neural network (RNN) learns the temporal context of the convolutional features so that the seam can be detected accurately under continuous super-strong noise. Experimental results show that the proposed seam detectors are extremely robust and that the SMFCN maintains extremely high detection accuracy under continuous super-strong noise. Welding results show that the laser vision seam tracking system using the SMFCN ensures that welding precision meets industrial requirements under a welding current of 150 A.