2020
DOI: 10.1007/s00371-020-01864-y
Real-time multimodal ADL recognition using convolution neural networks

Cited by 20 publications (7 citation statements)
References 40 publications
“…For instance, Couprie et al [214] proposed a bimodal CNN architecture for multiscale feature extraction from RGB-D datasets, which are taken as four-channel frames (blue, green, red, and depth). Similarly, Madhuranga et al [215] used CNN models for video recognition purposes by extracting silhouettes from depth sequences and then fusing the depth information with audio descriptions for activity of daily living (ADL) recognition. Zhang et al [217] proposed to use multicolumn CNNs to extract visual features from the face and eye images for the gaze point estimation problem.…”
Section: Convolutional Neural Network Based
confidence: 99%
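The early-fusion idea described above, treating an RGB-D input as a single four-channel frame (blue, green, red, depth), can be sketched minimally as a multi-channel 2D convolution. This is an illustrative NumPy sketch, not the architecture from [214]; the shapes, filter counts, and function name are assumptions.

```python
import numpy as np

def conv2d_multichannel(frame, kernels, stride=1):
    """Valid 2D convolution over a multi-channel frame.

    frame:   (H, W, C) array, e.g. C=4 for an RGB-D frame
    kernels: (K, k, k, C) array, one k x k x C filter per output map
    returns: (out_h, out_w, K) feature maps
    """
    H, W, C = frame.shape
    K, k, _, _ = kernels.shape
    out_h = (H - k) // stride + 1
    out_w = (W - k) // stride + 1
    out = np.zeros((out_h, out_w, K))
    for i in range(out_h):
        for j in range(out_w):
            patch = frame[i * stride:i * stride + k,
                          j * stride:j * stride + k, :]
            # each filter sums jointly over all channels, so colour and
            # depth are fused at the very first layer (early fusion)
            out[i, j, :] = np.tensordot(kernels, patch,
                                        axes=([1, 2, 3], [0, 1, 2]))
    return out

# toy 8x8 RGB-D frame: depth stacked as the fourth channel
rgb = np.random.rand(8, 8, 3)
depth = np.random.rand(8, 8, 1)
frame = np.concatenate([rgb, depth], axis=2)   # (8, 8, 4)
kernels = np.random.rand(6, 3, 3, 4)           # six 3x3x4 filters
maps = conv2d_multichannel(frame, kernels)
print(maps.shape)  # (6, 6, 6)
```

Because the depth channel enters the same filters as the colour channels, no separate depth stream or late fusion step is needed in this simplified view.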
“…The proposed model has achieved an accuracy of 90.8%. Madhuranga et al [14] employed CNNs on depth images, applying silhouette segmentation techniques previously used for RGB video. They developed the model for real-time recognition, and it is capable of recognizing ADL on commodity computers.…”
Section: Related Work
confidence: 99%
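Extracting a silhouette from a depth frame can be sketched, in its simplest form, as thresholding each pixel's depth against a static background model. This is a hedged simplification for illustration only, not the segmentation pipeline of [14] or [215]; the function name and tolerance value are assumptions.

```python
import numpy as np

def extract_silhouette(depth_frame, background, tol=0.05):
    """Binary silhouette mask from a depth frame.

    depth_frame, background: (H, W) depth maps in metres
    tol: minimum depth deviation (metres) to count as foreground
    returns: (H, W) uint8 mask, 1 = foreground
    """
    return (np.abs(depth_frame - background) > tol).astype(np.uint8)

background = np.full((4, 4), 3.0)   # empty-room depth: 3 m everywhere
frame = background.copy()
frame[1:3, 1:3] = 1.5               # a subject 1.5 m from the camera
sil = extract_silhouette(frame, background)
print(sil.sum())  # 4 foreground pixels
```

A depth-based threshold like this is insensitive to lighting and clothing colour, which is one reason depth silhouettes transfer well from RGB-video segmentation setups.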
“…HAR approaches are divided into vision-based and sensor-based. Vision-based HAR systems (Madhuranga et al, 2021) rely on visual sensing technologies (cameras and CCTV) to record human activities. Although these systems have achieved decent performance in recognizing various activities, they have drawbacks such as high cost, limited area coverage, and privacy concerns.…”
Section: Introduction
confidence: 99%