Proceedings of the 2019 2nd International Conference on Signal Processing and Machine Learning 2019
DOI: 10.1145/3372806.3372815
A Vision-based Human Action Recognition System for Moving Cameras Through Deep Learning

Cited by 10 publications (7 citation statements) | References 6 publications
“…In addition, they estimated the liquid intake volume by assuming that each intake sip was a constant 100 mL, leading to a poor estimate that was not validated [103]. Chang et al used a deep learning model trained on video and depth streams from a Kinect camera to classify several types of human activities, including drinking [104]. An average accuracy of 96.4% was achieved when combining color, depth, and optical flow in a CNN algorithm.…”
Section: Vision- and Environmental-based Methods
confidence: 99%
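The statement above describes combining color, depth, and optical-flow streams in a CNN. One common way such multi-stream systems merge modalities is late fusion: each stream produces class scores and the per-stream probabilities are averaged. The sketch below is only an illustration of that fusion step (not the cited paper's actual architecture); the stream weights and toy logits are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def late_fusion(color_logits, depth_logits, flow_logits, weights=(1.0, 1.0, 1.0)):
    """Weighted average of per-stream class probabilities (weights are assumptions)."""
    streams = (color_logits, depth_logits, flow_logits)
    probs = [w * softmax(s) for w, s in zip(weights, streams)]
    fused = np.sum(probs, axis=0) / sum(weights)
    return fused.argmax(axis=-1), fused

# Toy per-stream logits for one clip over 3 hypothetical action classes
color = np.array([2.0, 0.5, 0.1])   # color stream favours class 0
depth = np.array([1.8, 0.7, 0.2])   # depth stream agrees
flow  = np.array([0.3, 1.9, 0.4])   # optical flow favours class 1

label, scores = late_fusion(color, depth, flow)
```

Because two of the three streams agree, the fused prediction follows the majority; this robustness to a single noisy modality is a typical motivation for multi-stream fusion.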
“…In a related survey, a vision-based human action recognition system using a deep-learning technique is proposed by Chang et al [16], which can recognise human actions by retrieving information from colour videos, optical flow videos, and depth videos from the camera. This core research of HAR is not focused on the classroom; rather, it is based on activities in an indoor environment.…”
Section: Related Study
confidence: 99%
“…The fusion of modalities like RGB and depth information further refines recognition. Recent strides in attention mechanisms and metaheuristic algorithms have optimized network architectures, emphasizing relevant regions for improved performance [9, 61–72].…”
Section: Human Action Recognition
confidence: 99%