2022
DOI: 10.1007/978-3-030-98012-2_26

Efficiency in Human Actions Recognition in Video Surveillance Using 3D CNN and DenseNet

Cited by 5 publications (11 citation statements)
References 27 publications
“…Extracting spatiotemporal features from three consecutive frames can be applied to recognize human actions, as was proposed by Huillcen et al [29]. This method produces better efficiency results but still fails to surpass the state-of-the-art proposals in terms of effectiveness.…”
Section: Related Work | mentioning
confidence: 99%
“…An improvement to the previous approach in terms of efficiency was presented by Huillcen et al [44]. It uses a DenseNet architecture but with different configurations of dense layers and dense blocks to ensure the compactness of the model.…”
Section: Related Work | mentioning
confidence: 99%
“…The codebook is then fed into a Recurrent Neural Network (RNN) with an (LSTM) classifier for sequential input and classification. A new real-time model for recognising human violence using DL has been proposed [28]. The model is made up of two modules: a spatial attention module that identifies spatial features and regions of interest using frame difference between consecutive frames and morphological dilation and a temporal attention module that identifies temporal features by averaging the RGB channels to a single channel and inputting three frames into a 2D CNN backbone.…”
Section: Related Work | mentioning
confidence: 99%
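
The two modules described in the statement above can be illustrated with a minimal NumPy sketch. This is not the cited authors' implementation: the function names, the 3x3 structuring element, and the difference threshold of 25 are all assumptions chosen for illustration, and the 2D CNN backbone itself is omitted.

```python
import numpy as np


def dilate3x3(mask: np.ndarray) -> np.ndarray:
    """Binary dilation with a 3x3 square element (assumed size)."""
    h, w = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    # OR together the nine shifted copies of the mask.
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def spatial_attention(prev_frame, curr_frame, threshold=25):
    """Keep only the regions of curr_frame that changed since prev_frame.

    The threshold is a hypothetical value, not taken from the source.
    """
    # Signed arithmetic avoids uint8 wrap-around in the difference.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    moving = diff.max(axis=-1) > threshold
    mask = dilate3x3(moving)  # grow the moving region slightly
    return curr_frame * mask[..., None]


def temporal_stack(frames):
    """Average RGB to one channel per frame, then stack three frames
    as the channels of a single 2D-CNN input of shape (H, W, 3)."""
    if len(frames) != 3:
        raise ValueError("expected exactly three frames")
    gray = [f.astype(np.float32).mean(axis=-1) for f in frames]
    return np.stack(gray, axis=-1)
```

Stacking three single-channel frames keeps the input shape identical to an ordinary RGB image, which is presumably what allows a standard 2D backbone to consume short-range temporal information without 3D convolutions.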
“…The methodologies mentioned in [5][6][7][8][9][17][18][19][20]22,25,26,28] faced a shared challenge concerning the integration of new models into the existing framework. These methods require the existing models to be retrained from scratch, resulting in substantial demands on computational resources and time.…”
mentioning
confidence: 99%