2023
DOI: 10.1109/access.2023.3263155
Human Activity Recognition Based on Deep-Temporal Learning Using Convolution Neural Networks Features and Bidirectional Gated Recurrent Unit With Features Selection

Abstract: Recurrent Neural Networks (RNNs) and their variants have demonstrated tremendous success in modeling sequential data for tasks such as audio processing, video processing, time-series analysis, and text mining. Inspired by these results, we propose a human activity recognition technique that processes visual data using a convolutional neural network (CNN) and a bidirectional gated recurrent unit (Bi-GRU). First, we extract deep features from the frame sequences of human activity videos using a CNN and then select the most imp…
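The pipeline the abstract describes — per-frame CNN features fed to a bidirectional GRU for temporal modeling — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the layer sizes, pooling choices, and class count are assumptions, and the paper's feature-selection step is omitted.

```python
import torch
import torch.nn as nn

class CNNBiGRU(nn.Module):
    """Minimal sketch of a CNN + Bi-GRU activity recognizer.

    A small CNN extracts one feature vector per frame; a bidirectional
    GRU models the frame sequence; a linear head predicts the activity.
    All layer sizes are illustrative, not taken from the paper.
    """
    def __init__(self, feat_dim=128, hidden=64, num_classes=10):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),          # -> 16 x 4 x 4 per frame
            nn.Flatten(),
            nn.Linear(16 * 4 * 4, feat_dim),
        )
        self.bigru = nn.GRU(feat_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, clips):                  # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1))  # per-frame features
        feats = feats.view(b, t, -1)           # (B, T, feat_dim)
        out, _ = self.bigru(feats)             # (B, T, 2*hidden)
        return self.head(out[:, -1])           # last step -> class logits

model = CNNBiGRU()
logits = model(torch.randn(2, 8, 3, 32, 32))   # 2 clips of 8 frames
print(logits.shape)  # torch.Size([2, 10])
```

In practice the per-frame CNN would be a pretrained backbone (e.g. ResNet) rather than the toy stack above, and the GRU output could be pooled over all time steps instead of taking only the last.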

Cited by 21 publications (8 citation statements)
References 49 publications
“…The residual network has a strong feature-learning ability and suits the role of backbone convolutional architecture. In this study, ResNet-50 is adopted as the backbone network to mitigate the network degradation caused by the small number of abstract-painting samples, to simplify the model's training parameters and improve training efficiency, and to enable comparative experiments with residual-network variants in the ablation study assessing the backbone's influence on model accuracy (Ahmad et al., 2023). For each abstract painting, the backbone produces a canonical feature map of height and width 7 with 256 channels, which is flattened into a one-dimensional sequence of length 49 with 256 channels and fed to the encoder…”
Section: Attention Givenmentioning
confidence: 99%
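The flattening step the quoted passage describes — a 7×7×256 backbone feature map turned into a 49-token sequence for an encoder — is a common pattern and can be sketched in a few lines. The tensor below is a random stand-in for the backbone output, not actual ResNet-50 features.

```python
import torch

# Stand-in for the backbone output described in the passage:
# a 7x7 feature map with 256 channels per image.
batch = 2
fmap = torch.randn(batch, 256, 7, 7)

# Flatten the spatial grid into a sequence: each of the 7*7 = 49
# positions becomes one token with 256 channels, ready for an encoder.
seq = fmap.flatten(2)          # (B, 256, 49)
seq = seq.transpose(1, 2)      # (B, 49, 256)
print(seq.shape)  # torch.Size([2, 49, 256])
```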
“…The efficiency of the proposed approach is due to the fused feature extraction and classification performance. Accuracy on the UCF101 dataset: Two-stream ConvNet (40) 93.30; Implicit CNN (41) 89.8; GMM-KF-GRNN (42) 89.30; AR3D (43) 89.28; S3D-ConvNet (44) 86.6; Encoding RNNs (45) 81.9; P-RRNNs (46) 91.4; Deep Bi-LSTM (47) 91.21; CNN_Bi-GRU (48) 91.79…”
Section: Performance Analysismentioning
confidence: 99%
“…Table 7 presents a comparative analysis of the proposed HAR model against different DL algorithms. The accuracy of the proposed model is compared with other DL models: Two-stream ConvNet (40) , Implicit CNN (41) , Gaussian Mixture Model-Kalman Filter-Gated RNN (GMM-KF-GRNN) (42) , Attention Residual 3D (AR3D) Network (43) , Segments-based 3D ConvNet (S3D-ConvNet) (44) , Encoding RNNs (45) , P-RRNNs (46) , Deep Bi-LSTM (47) , and CNN_Bi-GRU (48) . The existing DL models use the same UCF101 dataset for the accuracy comparison.…”
Section: Performance Analysismentioning
confidence: 99%
“…In paper [21], the authors developed a privacy-protecting face recognition system via federated learning, in which facial recognition models were trained collaboratively without sharing raw data. They achieved accurate face recognition while keeping users' identities secret [38][39][40][41]. One popular approach proposed automatically segmenting and labeling single-channel or multimodal biosignal data using a self-similarity matrix (SSM) computed from feature-based representations of the signals [42].…”
Section: Related Workmentioning
confidence: 99%
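A self-similarity matrix of the kind mentioned in the last quoted passage pairs every feature window with every other and records their similarity. A minimal sketch, using random features as a stand-in for real per-window biosignal features and cosine similarity as one common choice of metric:

```python
import numpy as np

# Stand-in features: 50 signal windows, 8 features each. A real
# pipeline would extract these from sliding windows of a biosignal.
rng = np.random.default_rng(0)
feats = rng.standard_normal((50, 8))

# Cosine similarity between every pair of windows -> 50x50 SSM.
norm = feats / np.linalg.norm(feats, axis=1, keepdims=True)
ssm = norm @ norm.T

print(ssm.shape)                        # (50, 50)
print(np.allclose(np.diag(ssm), 1.0))   # each window matches itself
```

Block structure along the SSM diagonal then indicates segments of self-similar signal, which is what the segmentation step exploits.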