2022
DOI: 10.48550/arxiv.2206.01524
Preprint

Anomaly detection in surveillance videos using transformer based attention model

Abstract: Surveillance footage can capture a wide range of realistic anomalies. This research suggests using a weakly supervised strategy to avoid annotating anomalous segments in training videos, which is time-consuming. In this approach, only video-level labels are used to obtain frame-level anomaly scores. Weakly supervised video anomaly detection (WSVAD) suffers from the misidentification of abnormal and normal instances during the training process. Therefore, it is important to extract better quality features from t…

Cited by 2 publications (2 citation statements)
References 28 publications
“…Transformers are more effective at simulating temporal dynamics than more conventional approaches like ConvLSTM and can minimize processing time. Deshpande et al [28] used a three-stage model consisting of a pre-trained videoswin model, an attention layer, and lastly an RFTM model for anomaly detection. TransUNet is a hybrid of U-Net with a transformer encoder, as suggested by Chen et al [29].…”
Section: Transformer Based Methods
Citation type: mentioning
confidence: 99%
“…Transformers are more effective at simulating temporal dynamics than more conventional approaches like ConvLSTM and can minimize processing time. Deshpande et al [28] used a three-stage model consisting of a pre-trained videoswin model, an attention layer, and lastly an RFTM model for anomaly detection. TransUNet is a hybrid of U-Net with a transformer encoder, as suggested by Chen et al [29].…”
Section: Transformer Based Methods
Citation type: mentioning
confidence: 99%