Human action recognition using fusion of features for unconstrained video sequences

Kishore

Mathematical Problems in Engineering

2017

Extracting and recognizing complex human movements from unconstrained online video sequences is an interesting task. In this paper, this complicated problem is approached using unconstrained video sequences of Indian classical dance forms. A new segmentation model is developed using discrete wavelet transform (DWT) and local binary pattern (LBP) features. A 2D point cloud is created from the local human shape changes in subsequent video frames. The classifier is fed with five types of features calculated from Zernike moments, Hu moments, shape signature, LBP features, and Haar features. We also explore multiple feature fusion models, with early fusion during the segmentation stage and late fusion after segmentation, to improve the classification process. The extracted features are input to the Adaboost multiclass classifier with labels from the corresponding song (tala). We test the classifier on online dance videos and on an Indian classical dance dataset prepared in our lab. The algorithms were tested for accuracy and correctness in identifying the dance postures.
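As an illustration of the LBP features named in this abstract, the following is a minimal NumPy sketch of the basic 8-neighbour LBP operator and its normalised histogram. The function names `lbp_codes` and `lbp_histogram`, the neighbour ordering, and the `>=` comparison are assumptions made for this example; the abstract does not state which LBP variant the authors used.

```python
import numpy as np

def lbp_codes(gray):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2D array.

    Illustrative only: bit order and tie handling are assumptions.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbour offsets, clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def lbp_histogram(gray):
    """256-bin histogram of LBP codes, normalised to sum to 1."""
    codes = lbp_codes(gray)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

The histogram is a fixed-length vector regardless of image size, which is what makes it convenient to concatenate with other descriptors before classification.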

Section: Proposed Methodology

Indian Classical Dance Classification with Adaboost Multiclass Classifier on Multifeature Fusion

Kishore

Mathematical Problems in Engineering

2017

Turk J Elec Eng & Comp Sci

“…Signer identification, signer extraction, global and local shape feature extraction, and the classifier modules form the system. Further feature fusion concept from [18] is utilized in this work with two feature types, made from LBP features and Haar features. Back propagation algorithm explores the relativity between the query sign sequence and the known dataset.…”

Section: Proposed Methodology

Sign language recognition with multi feature fusion and ANN classifier

Ravi, Maloji, Kishore et al.

2018

Extracting and recognizing complex human movements such as sign language gestures from video sequences is a challenging task. In this paper, this difficult problem is approached with Indian sign language (ISL) videos. A new segmentation algorithm is developed by fusing features from the discrete wavelet transform (DWT) and local binary patterns (LBP). A 2D point cloud is formed from the fused features, which represent the local hand shapes in consecutive video frames. We validate the proposed feature extraction model against state-of-the-art features such as HOG, SIFT, and SURF for each sign video on the same ANN classifier. We found that the Haar-LBP fused features represent sign video data better than HOG, SIFT, and SURF. This is due to the combination of global and local features in our proposed feature matrix. The extracted features are input to the artificial neural network (ANN) classifier with labels forming the corresponding words. The proposed ANN classifier is tested against state-of-the-art classifiers such as Adaboost, support vector machine (SVM), and other ANN methods on different features extracted from the ISL dataset. The classifiers were tested for accuracy and correctness in identifying the signs. The ANN classifier achieved a recognition rate of 92.79% with the maximum number of training instances, which was far greater than that of existing works on sign language with other features and ANN classifiers on our ISL dataset.
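The Haar and LBP fusion described in this abstract can be sketched roughly as follows. This is a hedged illustration, not the authors' pipeline: the one-level Haar transform is standard, but the helper names `haar_dwt2` and `fuse_features`, the per-block L2 normalisation, and fusion by plain concatenation are all assumptions introduced for the example.

```python
import numpy as np

def haar_dwt2(gray):
    """One-level orthonormal 2D Haar transform.

    Returns (approx, row_detail, col_detail, diag_detail), each half the
    input size. Odd-sized inputs are cropped by one row or column.
    """
    g = np.asarray(gray, dtype=np.float64)
    g = g[: g.shape[0] // 2 * 2, : g.shape[1] // 2 * 2]
    # The four pixels of every non-overlapping 2x2 block.
    a, b = g[0::2, 0::2], g[0::2, 1::2]
    c, d = g[1::2, 0::2], g[1::2, 1::2]
    approx = (a + b + c + d) / 2.0       # low-pass in both directions
    row_detail = (a + b - c - d) / 2.0   # difference between the two rows
    col_detail = (a - b + c - d) / 2.0   # difference between the two columns
    diag_detail = (a - b - c + d) / 2.0  # diagonal difference
    return approx, row_detail, col_detail, diag_detail

def fuse_features(*blocks):
    """Assumed fusion rule: L2-normalise each block, then concatenate."""
    parts = []
    for block in blocks:
        v = np.asarray(block, dtype=np.float64).ravel()
        norm = np.linalg.norm(v)
        parts.append(v / norm if norm > 0 else v)
    return np.concatenate(parts)
```

In use, the approximation band from `haar_dwt2` would supply the global (coarse shape) part and an LBP histogram of the same frame the local (texture) part, for example `fuse_features(approx, lbp_hist)` with a hypothetical `lbp_hist` vector. Normalising each block first keeps one descriptor from dominating the other purely because of its scale.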

“…During the process of feature extraction to display action, a combination of contour-based distance signal feature, flow-based motion feature [12], [14], and uniform rotation local binary patterns can be used to define region of interest for feature extraction [15], [16], [17], [22]. Therefore, at this stage, suitable regions for extraction of the feature are determined.…”

Section: D) ROI Calculation

Complex Human Action Recognition in Live Videos Using Hybrid FR-DL Method

Serpush, Rezaei

2020

Preprint

Automated human action recognition is one of the most attractive and practical research fields in computer vision, in spite of its high computational costs. In such systems, human action labelling is based on the appearance and patterns of the motions in the video sequences; however, conventional methodologies and classic neural networks cannot use temporal information to predict actions in the upcoming frames of a video sequence. In addition, the computational cost of the preprocessing stage is high. In this paper, we address the challenges of the preprocessing phase by automatically selecting representative frames from the input sequences. Furthermore, we extract only the key features of the representative frames rather than the entire feature set. We propose a hybrid technique using background subtraction and HOG, followed by the application of a deep neural network and a skeletal modelling method. The combination of a CNN and an LSTM recurrent network is used for feature selection and for retaining previous information, and finally, a Softmax-KNN classifier is used for labelling human activities. We name our model the "Feature Reduction & Deep Learning" based action recognition method, or FR-DL for short. To evaluate the proposed method, we benchmark on the UCF dataset, which is widely used among researchers in action recognition. The dataset includes 101 complicated activities in the wild. Experimental results show a significant improvement in terms of accuracy and speed in comparison with six state-of-the-art articles.
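To make the idea of representative-frame selection concrete, here is a minimal sketch based on simple frame differencing. The abstract does not specify the selection criterion, so the rule below (keep a frame when its mean absolute difference from the last kept frame exceeds a threshold), the function name `select_representative_frames`, and the `threshold` parameter are assumptions for illustration only.

```python
import numpy as np

def select_representative_frames(frames, threshold):
    """Return indices of frames that differ enough from the last kept frame.

    frames: sequence of equally sized 2D grayscale arrays.
    threshold: assumed tuning parameter, in the same units as pixel values.
    """
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    if not frames:
        return []
    kept = [0]  # always keep the first frame as the initial reference
    for i in range(1, len(frames)):
        # Mean absolute pixel difference against the most recently kept frame.
        diff = np.mean(np.abs(frames[i] - frames[kept[-1]]))
        if diff > threshold:
            kept.append(i)
    return kept
```

Comparing against the last kept frame rather than the immediately preceding one means slow, gradual motion still triggers a new representative frame eventually, while near-duplicate frames are skipped.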