In this research work, we propose a method for human action recognition based on the combination of structural and temporal features. The sequence of body poses across the video frames is used to identify the action type. The structural variation features are obtained from the angles formed between joints during the action, where the angles are binned using multiple thresholds. The temporal features are computed from the displacement vectors of the joint locations across frames. The structural and temporal variation features are fused using a neural network to perform action classification. We conducted experiments on datasets of different categories, namely, the KTH, UTKinect, and MSR Action3D datasets. The experimental results demonstrate the superiority of the proposed method over several existing state-of-the-art techniques.
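The two feature types described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the threshold values, joint triples, and function names are assumptions chosen for the example.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (degrees) at joint b formed by the segments b->a and b->c."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def bin_angle(angle, thresholds=(30, 60, 90, 120, 150)):
    """Multi-threshold binning: return the index of the first bin the angle
    falls into (threshold values here are illustrative, not from the paper)."""
    for i, t in enumerate(thresholds):
        if angle < t:
            return i
    return len(thresholds)

def displacement_features(joints_t, joints_t1):
    """Temporal feature: flattened per-joint displacement vectors
    between two consecutive frames."""
    return (joints_t1 - joints_t).ravel()
```

The binned angle indices (structural) and the displacement vectors (temporal) would then be concatenated or otherwise fused as input to the classification network.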
In this paper, a video retrieval model is developed based on the Kirsch local descriptor. In the first stage, the input video is segmented into shots and keyframes are extracted. In the next stage, local descriptors are extracted from each keyframe and grouped using the k-means clustering procedure. Given a query frame, local descriptors are extracted from it in the same manner and then compared with the descriptors of the database videos using the k-nearest-neighbor search algorithm to find the matching keyframe. Experiments have been performed on TRECVID video segments to demonstrate the performance of the proposed approach for video retrieval applications.
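The clustering-and-matching pipeline can be sketched as below. This is a simplified illustration under stated assumptions: the Kirsch descriptor extraction itself is not shown, a plain k-means is used in place of whatever variant the paper employs, and the matching score (mean nearest-center distance, with k=1 retrieval) is one reasonable choice among several.

```python
import numpy as np

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain k-means: cluster a keyframe's local descriptors into k centers."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each descriptor to its nearest center
        dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def nearest_keyframe(query_descriptors, db_centers):
    """Nearest-neighbor matching: return the index of the database keyframe
    whose cluster centers lie closest to the query descriptors, scored by
    the mean distance from each query descriptor to its nearest center."""
    scores = []
    for centers in db_centers:
        dists = np.linalg.norm(query_descriptors[:, None] - centers[None], axis=2)
        scores.append(dists.min(axis=1).mean())
    return int(np.argmin(scores))
```

In the full system, `db_centers` would be precomputed offline for every keyframe in the database, so that only the query-side extraction and the nearest-neighbor search run at query time.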