Proceedings of the 28th ACM International Conference on Multimedia 2020
DOI: 10.1145/3394171.3413973

Cloze Test Helps: Effective Video Anomaly Detection via Learning to Complete Video Events

Abstract: As a vital topic in media content interpretation, video anomaly detection (VAD) has made fruitful progress via deep neural networks (DNNs). However, existing methods usually follow a reconstruction or frame prediction routine. They suffer from two gaps: (1) They cannot localize video activities in a manner that is both precise and comprehensive. (2) They lack sufficient ability to utilize high-level semantics and temporal context information. Inspired by the cloze test frequently used in language study, we propose a brand…

Cited by 170 publications (110 citation statements)
References 44 publications
“…Reconstruction of masked information. A surrogate task for many anomaly detection approaches [15,22,37,42,77] is to erase some information from the input, while making neural networks predict the erased information. Haselmann et al [22] framed anomaly detection as an inpainting problem, where patches from images are masked randomly, using the pixel-wise reconstruction error of the masked patches for surface anomaly detection.…”
Section: Related Work (mentioning)
confidence: 99%
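
The masked-patch scoring described in the statement above can be illustrated with a minimal sketch. It is not the method of Haselmann et al. or of any cited paper: the function names `masked_patch_error` and `mean_fill` are hypothetical, and `mean_fill` is a trivial stand-in for a trained inpainting network. It only shows the scoring idea, which is to erase a patch, restore it, and measure the pixel-wise error on the erased pixels.

```python
def masked_patch_error(image, top, left, size, inpaint):
    """Erase a size x size patch, restore it with `inpaint`, and return
    the mean squared pixel error over the erased pixels only."""
    h, w = len(image), len(image[0])
    erased = {(r, c) for r in range(top, top + size)
              for c in range(left, left + size)}
    # Copy the image; None marks an erased pixel.
    masked = [[None if (r, c) in erased else image[r][c] for c in range(w)]
              for r in range(h)]
    restored = inpaint(masked)
    return sum((restored[r][c] - image[r][c]) ** 2 for r, c in erased) / len(erased)


def mean_fill(masked):
    """Stand-in for a learned inpainting model (an assumption for this
    sketch): fill erased pixels with the mean of the visible pixels."""
    visible = [v for row in masked for v in row if v is not None]
    mean = sum(visible) / len(visible)
    return [[mean if v is None else v for v in row] for row in masked]


# A 4x4 "surface" that is uniform except for one bright defect at (1, 1).
surface = [[0.0] * 4 for _ in range(4)]
surface[1][1] = 1.0

defect_score = masked_patch_error(surface, top=1, left=1, size=2, inpaint=mean_fill)
normal_score = masked_patch_error(surface, top=0, left=2, size=2, inpaint=mean_fill)
print(defect_score > normal_score)
```

In this toy setup the patch covering the defect is restored poorly, so its error is higher than that of a patch covering only normal pixels. In practice the patch position would be sampled randomly, as the statement describes, and a trained network would replace `mean_fill`.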
“…In turn, ARNet learns to restore the original image and detect anomalies based on the assumption that normal images can be restored properly. The Cloze task [42] learns to complete a video when certain frames are removed, being recently employed by Yu et al [77] for anomaly detection. In a similar direction, Georgescu et al [17] proposed middle frame masking as one of the auxiliary tasks for video anomaly detection.…”
Section: Related Work (mentioning)
confidence: 99%
“…In contrast, our approach encourages AEs to produce unconstrained reconstructions for normal inputs while limiting the reconstructions for anomalous inputs, thus producing more discriminative anomaly scores. Non-Reconstruction Methods: Several researchers adopt different schemes for OCC based anomaly detection: focusing only on objects by utilizing object detectors in the frameworks [6,7,8,11,12,43,52]; predicting future frames from the past few consecutive frames with the intuition that it is difficult to predict unseen anomalous data [5,24,27,28,35]; or incorporating adversarial components [14,19,20,24,39,45]. Our approach is different as we do not utilize any additional component and solely rely on the reconstruction based AEs.…”
Section: Related Work (mentioning)
confidence: 99%
“…In spite of the growing interest in video anomaly detection [9, 10, 14-16, 19-21, 24, 29, 31, 36-38, 40, 43, 49, 51, 57, 58, 61, 63], which generated significant advances leading to impressive performance levels [14,15,18,24,29,53,56,57,61,63,64], the task remains very challenging. The difficulty of the task stems from two interdependent aspects: (i) the reliance on context of anomalies, and (ii) the lack of abnormal training data.…”
Section: Introduction (mentioning)
confidence: 99%