Advances in technology generate huge amounts of non-textual information, such as images, so an efficient image annotation and retrieval system is highly desirable. Clustering algorithms make it possible to represent the visual features of images with a finite set of symbols. Building on this, many statistical models have been published that analyze the correspondence between visual features and words and discover hidden semantics. These models improve the annotation and retrieval of large image databases. However, the current state of the art, including our previous work, produces too many irrelevant keywords for images during annotation. In this paper, we propose a novel approach that augments the classical model with a generic knowledge base, WordNet, and uses it to prune irrelevant keywords. To identify irrelevant keywords, we investigate various semantic similarity measures between keywords and fuse the outcomes of all these measures to make a final decision using Dempster-Shafer evidence combination. We have implemented various models that link visual tokens with keywords based on WordNet, and evaluated their performance in terms of precision and recall on a benchmark dataset. The results show that augmenting the classical model with a knowledge base improves annotation accuracy by removing irrelevant keywords.
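The fusion step mentioned in this abstract can be illustrated with Dempster's rule of combination. The following is only a minimal sketch, not the authors' implementation: the two-hypothesis frame (`relevant` / `irrelevant`), the function name `combine_dempster`, and the mass values are all assumptions chosen for illustration.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1].
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is treated as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical frame of discernment for one candidate keyword.
REL = frozenset({"relevant"})
IRR = frozenset({"irrelevant"})
EITHER = REL | IRR  # mass left uncommitted ("don't know")

# Hypothetical evidence from two semantic similarity measures.
measure_a = {REL: 0.6, IRR: 0.1, EITHER: 0.3}
measure_b = {REL: 0.5, IRR: 0.2, EITHER: 0.3}

fused = combine_dempster(measure_a, measure_b)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 3))
```

Because the rule is associative, more than two measures can be fused by applying `combine_dempster` repeatedly; a keyword could then be pruned when the fused mass on `irrelevant` exceeds some threshold, which is again an assumption rather than a detail given in the abstract.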
Images on the Web and on personal computers are now prevalent in everyday life. Many automatic image annotation (AIA) algorithms have been proposed to retrieve these images effectively. However, AIA still suffers from low accuracy because it cannot overcome the semantic gap between low-level features (e.g., color, texture, and shape) and high-level semantic meanings (e.g., 'sky', 'beach'). As a result, AIA techniques annotate images with many noisy keywords. In this paper, we propose a novel approach that augments the classical model with a generic knowledge base, WordNet, and uses it to prune irrelevant keywords. To identify irrelevant keywords, we investigate various semantic similarity measures between keywords and fuse the outcomes of all these measures to make a final decision using Dempster-Shafer evidence combination. Furthermore, we reformulate the removal of erroneous keywords from image annotations as a graph-partitioning problem, namely the weighted MAX-CUT problem. Web images may have too many candidate keywords, so a polynomial-time algorithm for the MAX-CUT problem is needed. We show that finding an optimal solution for removing noisy keywords in the graph is an NP-complete problem and propose a new methodology for Knowledge Based Image Annotation Refinement (KBIAR) using a deterministic polynomial time algorithm, namely, a randomized approximation graph algorithm. Finally, we demonstrate the superiority of this algorithm over traditional ones, including the most recent work, on a benchmark dataset.
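To make the graph-partitioning view concrete, the sketch below uses the textbook randomized approximation for weighted MAX-CUT: assign each vertex to a side by a fair coin flip, which cuts half of the total edge weight in expectation, and keep the best of several trials. This is a generic illustration, not the KBIAR algorithm itself; the keyword graph, the edge weights (read here as semantic distances), and all function names are assumptions.

```python
import random


def random_cut(nodes, edges, rng):
    """Assign every node to one of two sides at random and weigh the cut."""
    side = {node: rng.random() < 0.5 for node in nodes}
    weight = sum(w for u, v, w in edges if side[u] != side[v])
    return side, weight


def best_of_random_cuts(nodes, edges, trials=200, seed=0):
    """Repeat the random assignment and keep the heaviest cut found."""
    rng = random.Random(seed)
    best_side, best_weight = None, -1.0
    for _ in range(trials):
        side, weight = random_cut(nodes, edges, rng)
        if weight > best_weight:
            best_side, best_weight = side, weight
    return best_side, best_weight


# Hypothetical candidate keywords for one image; weights are made-up
# semantic distances (larger means less related).
keywords = ["sky", "beach", "sea", "engine"]
distances = [
    ("sky", "beach", 0.2), ("sky", "sea", 0.3), ("beach", "sea", 0.1),
    ("sky", "engine", 0.9), ("beach", "engine", 0.8), ("sea", "engine", 0.9),
]

partition, cut_weight = best_of_random_cuts(keywords, distances)
total_weight = sum(w for _, _, w in distances)
print(partition, cut_weight, total_weight)
```

With distances as weights, a heavy cut separates mutually dissimilar keywords, so one side of the partition can be read as the candidate set of noisy keywords. How the two sides are interpreted in KBIAR is not stated in the abstract.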
Human motion recognition in video data has several interesting applications in fields such as gaming, senior/assisted-living environments, and surveillance. In these scenarios, we may have to consider adding new motion classes (i.e., new types of human motions to be recognized), as well as new training data (e.g., for handling different types of subjects). Hence, both classification accuracy and training time become important performance parameters for the machine learning algorithms in these cases. In this article, we propose a knowledge-based hybrid (KBH) method that can compute the probabilities for hidden Markov models (HMMs) associated with different human motion classes. This computation is facilitated by appropriately mixing features from two different media types (3D motion capture and 2D video). We conducted a variety of experiments comparing the proposed KBH method for HMMs with the traditional Baum-Welch algorithm. Because it computes the HMM parameters in a noniterative manner, the KBH method outperforms the Baum-Welch algorithm in both accuracy and training time. Moreover, we show in additional experiments that the KBH method also outperforms the linear support vector machine (SVM). ACM Reference Format: Suk, M., Ramadass, A., Jin, Y., and Prabhakaran, B. 2012. Video human motion recognition using a knowledge-based hybrid method based on a hidden Markov model.
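The abstract does not say how KBH computes the HMM probabilities. One common noniterative alternative to Baum-Welch, shown here purely as an illustration, is to count transitions and emissions when a state label is available for every frame. The assumption that state labels come from one medium and observation symbols from the other, and every name below, is hypothetical.

```python
from collections import Counter


def estimate_hmm_by_counting(state_seqs, obs_seqs, states, symbols, alpha=1.0):
    """Estimate HMM parameters in one pass from labeled sequences.

    alpha is an additive (Laplace) smoothing constant so that unseen
    transitions and emissions keep a nonzero probability.
    """
    start, trans, emit = Counter(), Counter(), Counter()
    for s_seq, o_seq in zip(state_seqs, obs_seqs):
        start[s_seq[0]] += 1
        for prev, nxt in zip(s_seq, s_seq[1:]):
            trans[(prev, nxt)] += 1
        for state, obs in zip(s_seq, o_seq):
            emit[(state, obs)] += 1

    def normalize(counts, rows, cols):
        table = {}
        for r in rows:
            total = sum(counts[(r, c)] for c in cols) + alpha * len(cols)
            table[r] = {c: (counts[(r, c)] + alpha) / total for c in cols}
        return table

    start_total = sum(start.values()) + alpha * len(states)
    pi = {s: (start[s] + alpha) / start_total for s in states}
    return pi, normalize(trans, states, states), normalize(emit, states, symbols)


# Hypothetical data: per-frame pose states and quantized video symbols.
states = ["stand", "step"]
symbols = ["a", "b", "c"]
state_seqs = [["stand", "step", "step", "stand"], ["stand", "stand", "step"]]
obs_seqs = [["a", "b", "c", "a"], ["a", "a", "b"]]

pi, A, B = estimate_hmm_by_counting(state_seqs, obs_seqs, states, symbols)
print(pi, A, B)
```

A single counting pass like this needs no convergence loop, which is one way a noniterative estimator can reduce training time relative to Baum-Welch.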
3D human motion capture is a form of multimedia data that is widely used in entertainment as well as medical fields (such as orthopedics, physical medicine, and rehabilitation where gait analysis is needed). These applications typically create large repositories of motion capture data and need efficient and accurate content-based retrieval techniques. 3D motion capture data is in the form of multidimensional time-series data. To reduce the dimensions of human motion data while maintaining semantically important features, we quantize human motion data by extracting spatio-temporal features through SVD and translate them onto a symbolic sequential representation through our proposed sGMMEM (semantic Gaussian Mixture Modeling with EM). In order to handle variations in motion capture data due to human body characteristics and speed of motion, we transform the semantically quantized values into a histogram representation. This representation is used as a signature for classification and similarity-based retrieval. We achieved good classification accuracies for “coarse” human motion categories (such as walking 92.85%, run 91.42%, and jump 94.11%) and even for subtle categories (such as dance 89.47%, laugh 83.33%, basketball signal 85.71%, golf putting 80.00%). Experiments also demonstrated that the proposed approach outperforms earlier techniques such as the wMSV (weighted Motion Singular Vector) approach and LB_Keogh method.
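The histogram signature described in this abstract can be sketched as follows. This is a minimal illustration under assumptions: the symbol sequences stand in for sGMMEM output, and the choice of a length-normalized histogram compared by histogram intersection is ours, not something the abstract specifies.

```python
from collections import Counter


def histogram_signature(symbol_seq, alphabet_size):
    """Turn a symbol sequence into a length-normalized histogram."""
    counts = Counter(symbol_seq)
    length = len(symbol_seq)
    return [counts[k] / length for k in range(alphabet_size)]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical quantized motions over a 3-symbol alphabet: the same
# motion at two speeds, plus a different motion.
slow_walk = [0, 0, 1, 1, 2, 2]
fast_walk = [0, 1, 2]
jump = [2, 2, 2, 1]

sig_slow = histogram_signature(slow_walk, 3)
sig_fast = histogram_signature(fast_walk, 3)
sig_jump = histogram_signature(jump, 3)

print(histogram_intersection(sig_slow, sig_fast))
print(histogram_intersection(sig_slow, sig_jump))
```

Normalizing by sequence length discards duration and ordering, which is what makes the signature tolerant of speed differences between performers; the cost is that motions differing only in the order of their symbols become indistinguishable.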