2018
DOI: 10.1016/j.patrec.2018.04.035

Combining CNN streams of RGB-D and skeletal data for human activity recognition

citations
Cited by 131 publications
(70 citation statements)
references
References 21 publications
“…In addition, from Table 5 we observe that the proposed method achieves comparable performance compared with method [68] in which learned features from three modalities are fused in feature-level. The method [70] leads to the state-of-the-art performance since it fuses information collected from four types of sensors, i.e., RGB camera, depth sensor, and two wearable inertial sensors (accelerometer and gyroscope data) for action recognition.…”
Section: E. UTD-MHAD Dataset and Performance Evaluation
mentioning
confidence: 88%
“…where I_X and I_Y are the derivatives of the image with respect to X and Y, respectively. [Figure 4: The steps of the appropriate frame region selection and extraction of the skeleton motion [16], [37]] In order to obtain these derivatives, horizontal and vertical Sobel filters (i.e.…”
Section: B) BGS and Human Action Identification
mentioning
confidence: 99%
“…During the process of feature extraction to display action, a combination of contour-based distance signal feature, flow-based motion feature [12], [14], and uniform rotation local binary patterns can be used to define region of interest for feature extraction [15], [16], [17], [22]. Therefore, at this stage, suitable regions for extraction of the feature are determined.…”
Section: D) ROI Calculation
mentioning
confidence: 99%
“…Khaire et al. [14] proposed a ConvNets-based approach for activity recognition by combining multiple deep learning methods' cues. A new method of creating skeleton images, from skeleton joint sequences, representing motion information is also presented in this paper.…”
Section: Related Work
mentioning
confidence: 99%