To analyze traffic anomalies in dashcam videos from the perspective of the ego-vehicle, an agent should localize the abnormal event and regions in space and time and give a semantic recounting of what happened. Most existing formulations concentrate on the spatial-temporal aspect and mainly approach it by training normal-pattern classifiers, regressors, or dictionaries on large-scale labeled data. However, anomalies are context-dependent, and the boundary between abnormal and normal is difficult to draw clearly. This paper proposes a progressive, unsupervised driving anomaly detection and recounting (D&R) framework. The highlights are three-fold: (1) We formulate driving anomaly D&R as a temporal-spatial-semantic (TSS) model, which focuses in a coarse-to-fine manner and generates convincing driving anomaly D&R results. (2) This work contributes an unsupervised D&R approach that requires no training data while still performing effectively. (3) We introduce traffic saliency, isolation forests, and the visual semantic causal relations of the driving scene to construct the TSS model. Extensive experiments on a driving anomaly dataset of 106 video clips, which we carefully labeled at the temporal, spatial, and semantic levels, demonstrate superior performance over existing techniques.