2020
DOI: 10.1109/access.2020.3029834

Event-Oriented 3D Convolutional Features Selection and Hash Codes Generation Using PCA for Video Retrieval

Abstract: The extensive video surveillance networks gather an enormous amount of data exponentially on a daily basis and its management is a challenging task, requiring efficient and effective techniques for searching, indexing, and retrieval. The employed mainstream techniques are focusing on general category videos, where the important events in surveillance require fine-grained events retrieval. In this paper, we introduce an event-oriented feature selection mechanism by utilizing the intermediate convolutional layer…

Citations: cited by 18 publications (14 citation statements)
References: 32 publications
“…Thus, recent research applied 3D CNNs that represent video data including temporal information more efficiently. Ullah et al [17] used a pre-trained C3D model [18] on the Sports-1M dataset [24] and reduced the dimensionality by means of PCA to generate the hash code. However, according to work by Hara et al [19], Kataoka et al [20], and Tran et al [29], working with the Sports-1M dataset is not easy because it contains more videos than the Kinetics 400 and Kinetics 700 datasets [30], [31], and their annotations are noisier than those of these Kinetics datasets.…”
Section: A. Use of CNNs for CBVR (mentioning)
confidence: 99%
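The citation statement above summarizes the pipeline as: extract features with a pre-trained 3D CNN, reduce them with PCA, and derive a hash code. A minimal sketch of the PCA-to-binary-code step follows. It is an illustration under stated assumptions, not the cited paper's implementation: the random matrix stands in for real C3D features, the sign-threshold binarization is a common choice that the quoted text does not confirm, and the names `fit_pca`, `hash_codes`, and `hamming_distance` are hypothetical.

```python
import numpy as np


def fit_pca(features, n_bits):
    """Return the feature mean and the top `n_bits` principal directions."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_bits].T


def hash_codes(features, mean, components):
    """Project onto the principal directions and binarize by sign (assumed rule)."""
    projected = (features - mean) @ components
    return (projected > 0).astype(np.uint8)


def hamming_distance(code, codes):
    """Number of differing bits between one code and every row of `codes`."""
    return np.count_nonzero(codes != code, axis=1)


# Stand-in data: 200 videos with 64-dimensional features (assumption, not real C3D output).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 64))

mean, components = fit_pca(features, n_bits=16)
codes = hash_codes(features, mean, components)
distances = hamming_distance(codes[0], codes)
```

Retrieval then amounts to ranking database videos by `distances`; shorter codes trade accuracy for faster comparison and smaller storage.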
“…In other words, the ActivityNet dataset has much noisier frames which are not closely related to the video context compared to the UCF101 and HMDB51 datasets. Table 5 shows the mAP results of the proposed PCA-CBVR with the R3D50 feature extractor, which shows the best performance as shown in Figure 5

Methods                               UCF101   HMDB51   ActivityNet
Event-Oriented Video Retrieval [17]   0.83     0.75     -
Stacked HetConv-MK-BiDLSTM [16]       -        -        0.17
PCA-CBVR (without fine-tuning)        0.81     0.57     0.35
…”
Section: Performance Analysis Depending On Feature Extractors Using D... (mentioning)
confidence: 99%
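The statement above compares methods by mAP. As a rough guide to how such a score can be computed for hash-based retrieval, the sketch below ranks a database by Hamming distance and averages the per-query average precision. The exact evaluation protocol of the cited works is not given in the quoted text, so this follows one common definition; the function names and the label-match relevance rule are assumptions.

```python
import numpy as np


def average_precision(relevant):
    """AP for one ranked list; `relevant[i]` is True if the i-th result is relevant."""
    relevant = np.asarray(relevant, dtype=bool)
    if not relevant.any():
        return 0.0
    hits = np.cumsum(relevant)                 # relevant items seen up to each rank
    ranks = np.arange(1, len(relevant) + 1)
    return float((hits[relevant] / ranks[relevant]).mean())


def mean_average_precision(query_codes, query_labels, db_codes, db_labels):
    """Rank the database by Hamming distance per query and average the APs."""
    aps = []
    for code, label in zip(query_codes, query_labels):
        dist = np.count_nonzero(db_codes != code, axis=1)
        order = np.argsort(dist, kind="stable")
        # Assumed relevance rule: a result is relevant if it shares the query's label.
        aps.append(average_precision(db_labels[order] == label))
    return float(np.mean(aps))


# Toy example with two classes whose codes are perfectly separated.
db_codes = np.array([[0, 0], [0, 0], [1, 1], [1, 1]], dtype=np.uint8)
db_labels = np.array([0, 0, 1, 1])
query_codes = np.array([[0, 0], [1, 1]], dtype=np.uint8)
query_labels = np.array([0, 1])
toy_map = mean_average_precision(query_codes, query_labels, db_codes, db_labels)
```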
“…Our first limitation is that the targeted domain for reviewed papers in this survey is comparatively narrow, i.e., surveillance BVD. A broader version can be the complete video analytics domain, covering action and activity recognition [60], video summarization [61], video retrieval [7], healthcare [62], objects detection and tracking [63,64], etc. The specific focus of our research is to conduct an in-depth review of fuzzy methods applied to the generic video analysis domain, towards deriving a proper taxonomy of the applied fuzzy techniques.…”
Section: Representative Surveys In Fuzzy Logic and Our Survey (mentioning)
confidence: 99%
“…As such, the trained models show limited performance in complex smart cities surveillance applications [6]. For improved utilization of surveillance BVD, researchers are investigating several image processing and machine learning algorithms [7].…”
Section: Introduction (mentioning)
confidence: 99%