This review article extensively surveys current progress in video-based human activity recognition. It addresses three aspects of the field: core technology, human activity recognition systems, and applications, moving from low-level to high-level representation. For the core technology, three critical processing stages are discussed thoroughly: human object segmentation, feature extraction and representation, and activity detection and classification algorithms. For human activity recognition systems, three main types are covered: single-person activity recognition, multiple-people interaction and crowd behavior, and abnormal activity recognition. Finally, the application domains are discussed in detail, specifically surveillance environments, entertainment environments, and healthcare systems. Our survey, which aims to provide a comprehensive state-of-the-art review of the field, also addresses several challenges associated with these systems and applications, and gives particular attention to applications in healthcare monitoring systems. Open Access. Computers 2013, 2, 89
Human action recognition is used in areas such as surveillance, entertainment, and healthcare. This paper proposes a system that recognizes both single and continuous human actions from monocular video sequences, based on 3D human modeling and cyclic hidden Markov models (CHMMs). First, for each frame in a monocular video sequence, the 3D coordinates of the joints of a human object performing actions over multiple cycles are extracted using 3D human modeling techniques. The 3D coordinates are then converted into a set of geometrical relational features (GRFs) to reduce dimensionality and increase discriminative power. For further dimensionality reduction, k-means clustering is applied to the GRFs to generate clustered feature vectors. These vectors are used to train a separate CHMM for each type of action with the Baum-Welch re-estimation algorithm. To recognize continuous actions that are concatenated from several distinct types of actions, a purpose-designed graphical model systematically concatenates the separately trained CHMMs. The experimental results show that the proposed system performs effectively on both single and continuous action recognition problems.

I. Introduction

Human action recognition is a growing topic in video analysis and understanding, one of the most popular areas in the computer vision community, thanks to its applications in surveillance, entertainment, and healthcare. In surveillance, human action recognition can be applied to video camera footage to help recognize and analyze human actions. In entertainment, human action recognition can make human-computer interaction appear more natural, which in turn can improve the entertainment experience. In healthcare, human action recognition can help detect abnormal gaits or assist in a patient's rehabilitation through an analysis of their actions.

However, recognizing various human actions is challenging because of the high number of degrees of freedom associated with the human body, namely: variations in human poses; variations in the colors of a person's clothing; changes in lighting and illumination; variations in viewpoints; and frequent self-occlusion. Moreover, the use of monocular video sequences makes human action recognition even more difficult.

Generally, human action recognition has two main stages: the feature extraction and representation stage, and the classification stage.

In the feature extraction and representation stage, the features or characteristics of video frames, such as silhouette, shape, color, and motion, are extracted and represented in a systematic and efficient way. [...] consists of stacking segmented silhouettes (frame by frame) to form a 3D spatial-temporal shape. In a similar way, Ke and others [2] build STVs for shape-based matching from image features that are based on the consecutive silhouettes of objects along a time axis, including spatial-temporal region extraction and region matching. Kim and oth...
This paper proposes a system to recognize quasi-periodic human actions from monocular video sequences. First, each input video frame is analyzed to estimate the best 3D human model pose, which consists of a set of 3D coordinates of specific human joints. Next, these 3D coordinates for each frame are converted into corresponding 3D geometric relational features (GRFs), which describe the geometric relations among the body joints of a pose. Finally, we train a cyclic hidden Markov model (CHMM) for each action based on the vector-quantized 3D GRFs, and the trained CHMMs are used to classify different quasi-periodic human actions. The experimental results indicate the effectiveness of the proposed system in terms of viewpoint invariance, low-dimensional feature vectors, and encouraging recognition rates.
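The pipeline described in this abstract (joint coordinates, then relational features, then vector quantization, then one CHMM per action) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: pairwise joint distances stand in for the GRFs, whose exact definition is not given here; the codebook is assumed to come from a clustering step such as k-means; the cyclic topology is assumed to be a left-to-right chain whose last state loops back to the first; and the model parameters are set by hand rather than trained with Baum-Welch. All function names and action labels are hypothetical.

```python
import numpy as np

def pairwise_distance_features(joints):
    """Turn (J, 3) joint coordinates into a vector of pairwise distances.

    Assumption: a simple stand-in for geometric relational features.
    """
    i, j = np.triu_indices(len(joints), k=1)
    return np.linalg.norm(joints[i] - joints[j], axis=1)

def quantize(features, codebook):
    """Map each row of features (T, D) to its nearest codeword in codebook (K, D)."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def cyclic_transitions(n_states, p_stay=0.5):
    """Assumed cyclic topology: stay, or advance; the last state wraps to the first."""
    A = np.zeros((n_states, n_states))
    for s in range(n_states):
        A[s, s] += p_stay
        A[s, (s + 1) % n_states] += 1.0 - p_stay
    return A

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | pi, A, B) for discrete symbols."""
    alpha = pi * B[:, obs[0]]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = (alpha @ A) * B[:, obs[t]]
        scale = alpha.sum()
        if scale == 0.0:  # the sequence is impossible under this model
            return -np.inf
        log_p += np.log(scale)
        alpha = alpha / scale
    return log_p

def classify(obs, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

if __name__ == "__main__":
    # Hand-set toy models (hypothetical): each state mostly emits one symbol.
    pi = np.full(3, 1.0 / 3.0)
    A = cyclic_transitions(3)
    emit = np.full((3, 3), 0.1) + 0.7 * np.eye(3)
    models = {
        "forward_cycle": (pi, A, emit),        # symbols cycle 0, 1, 2, 0, ...
        "reverse_cycle": (pi, A, emit[::-1]),  # symbols cycle 2, 1, 0, 2, ...
    }
    symbols = [0, 0, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2]  # two cycles of one action
    print(classify(symbols, models))
```

The wraparound transition is what lets a single model score a sequence containing several repetitions of the same action without being told where each cycle starts or ends. In practice the emission and transition probabilities would be estimated from training sequences rather than fixed by hand.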