Incremental EM for Probabilistic Latent Semantic Analysis on Human Action Recognition

Xu, Jie; Ye, Getian; Wang, Yang; Herman, Gunawan; Zhang, Bang; Yang, Jun

doi:10.1109/avss.2009.66

Cited by 7 publications

(9 citation statements)

References 15 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…PLSA is a probabilistic generative model that has sucessfully been applied to action recognition (e.g. [26], [28]), and represents each video recording as a bag of words, decomposed into a probability distribution of words per 'topic' (which represents the action class) and a probability of a topic occurring in the input recording.…”

Section: Naive Bayesmentioning

confidence: 99%

“…Evaluation Recog.rate(%) MD + Gaussian RBM + NB split 85.65 MD + Gaussian RBM + pLSA split 88.89 HST + pLSA [17] l-o-o 83.33 MF + SVM [24] l-o-o 83.31 HST + SVM [19] split 71.72 HST + iEM+PLSA [26] l-o-o 82.33 LST + SVM [6] split 86.6 SD + S-LDA [24] l-o-o 91.20 LST + SVM [14] split 93.9 [17], [24], [19], [26], [6], [14] ARE COPIED FROM THE ORIGINAL PAPERS are reported along with results from [17], [24], [19], [26], [6], [14] as in Table II. Despite the simplicity, our approach is able to achieve good performance among state-of-the-art approaches.…”

Section: Approachmentioning

confidence: 99%

“…However, action recognition is still an open problem due to numerous associated challenges, including camera motion, occlusion, and cluttered background [26].…”

Section: Introductionmentioning

confidence: 99%

“…Schuldt et al [19] use local space-time features with support vector machines (SVMs) to recognize human actions in the proposed KTH dataset. Xu et al [26] employed the incremental expectation-maximization algorithm to improve PLSA performance for action classification with the use of colourcoded space-time features. Niebles et al [17] utilized spatiotemporal features to generate visual words and performed action modelling using PLSA.…”

Section: Introductionmentioning

confidence: 99%

“…Despite its simplicity, the motion-difference method achieved very good performance in benchmark dataset such as the Weizmann dataset (98.81%) and the KTH dataset (88.89%). A comparative analysis with other features using simple classifiers such as NaiveBayes and PLSA indicates that the motion-difference with Gaussian RBM is competitive with shape descriptor [28], motion descriptor [24], hand-crafted spatio-temporal features [17], [19], [26], and spatio-temporal features learned by Deep Networks [6]. In addition to good performance, our approach shows more efficicent than many other approaches [28], [24], [17], [19], [26], [6], [14], [16].…”

Section: Introductionmentioning

confidence: 99%

See 4 more Smart Citations

Learning motion-difference features using Gaussian restricted Boltzmann machines for efficient human action recognition

Tran

Benetos

Garcez

2014

2014 International Joint Conference on Neural Networks (IJCNN)

View full text Add to dashboard Cite

This is the accepted version of the paper.This version of the publication may differ from the final published version. Abstract-Learning visual words from video frames is challenging because deciding which word to assign to each subset of frames is a difficult task. For example, two similar frames may have different meanings in describing human actions such as starting to run and starting to walk. In order to associate richer information to vector-quantization and generate visual words, several approaches have been proposed recently that use complex algorithms to extract or learn spatio-temporal features from 3-D volumes of video frames. In this paper, we propose an efficient method to use Gaussian RBM for learning motiondifference features from actions in videos. The difference between two video frames is defined by a subtraction function of one frame by another that preserves positive and negative changes, thus creating a simple spatio-temporal saliency map for an action. This subtraction function removes, by construction, the common shapes and background images that should not be relevant for action learning and recognition, and highlights the movement patterns in space, making it easier to learn the actions from such saliency maps using shallow feature learning models such as RBMs. In the experiments reported in this paper, we used a Gaussian restricted Boltzmann machine to learn the actions from saliency maps of different motion images. Despite its simplicity, the motion-difference method achieved very good performance in benchmark datasets, specifically the Weizmann dataset (98.81%) and the KTH dataset (88.89%). A comparative analysis with hand-crafted and learned features using similar classifiers indicates that motion-difference can be competitive and very efficient. Permanent repository link

show abstract

Section: Naive Bayesmentioning

confidence: 99%

Section: Approachmentioning

confidence: 99%

“…However, action recognition is still an open problem due to numerous associated challenges, including camera motion, occlusion, and cluttered background [26].…”

Section: Introductionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%