Recognizing human activities in partially observed videos is a challenging problem and has many practical applications. When the unobserved subsequence is at the end of the video, the problem is reduced to activity prediction from unfinished activity streaming, which has been studied by many researchers. However, in the general case, an unobserved subsequence may occur at any time by yielding a temporal gap in the video. In this paper, we propose a new method that can recognize human activities from partially observed videos in the general case. Specifically, we formulate the problem into a probabilistic framework: 1) dividing each activity into multiple ordered temporal segments, 2) using spatiotemporal features of the training video samples in each segment as bases and applying sparse coding (SC) to derive the activity likelihood of the test video sample at each segment, and 3) finally combining the likelihood at each segment to achieve a global posterior for the activities. We further extend the proposed method to include more bases that correspond to a mixture of segments with different temporal lengths (MSSC), which can better represent the activities with large intra-class variations. We evaluate the proposed methods (SC and MSSC) on various real videos. We also evaluate the proposed methods on two special cases: 1) activity prediction where the unobserved subsequence is at the end of the video, and 2) human activity recognition on fully observed videos. Experimental results show that the proposed methods outperform existing state-of-the-art comparison methods.
Abstract. We present an approach to figure/ground organization using mirror symmetry as a general purpose and biologically motivated prior. Psychophysical evidence suggests that the human visual system makes use of symmetry in producing three-dimensional (3-D) percepts of objects. 3-D symmetry aids in scene organization because (i) almost all objects exhibit symmetry, and (ii) configurations of objects are not likely to be symmetric unless they share some additional relationship. No general purpose approach is known for solving 3-D symmetry correspondence in two-dimensional (2-D) camera images, because few invariants exist. Therefore, we present a general purpose method for finding 3-D symmetry correspondence by pairing the problem with the two-view geometry of the binocular correspondence problem. Mirror symmetry is a spatially global property that is not likely to be lost in the spatially local noise of binocular depth maps. We tested our approach on a corpus of 180 images collected indoors with a stereo camera system. K -means clustering was used as a baseline for comparison. The informative nature of the symmetry prior makes it possible to cluster data without a priori knowledge of which objects may appear in the scene, and without knowing how many objects there are in the scene. IntroductionAccording to most studies of human vision, the first step in visual perception is determining whether there are objects in front of the observer: where they are and how many there are. This step (visual function) is called figure-ground organization (FGO). 1 The computer vision community refers to this problem as object discovery. As with all natural visual functions of human observers, FGO operates in three-dimensional (3-D) space, as opposed to the two-dimensional (2-D) retinal image. It follows that it is natural to think about visual mechanisms underlying FGO as based on 3-D operations. However, the fact that the input to the visual system is one or more 2-D retinal images encouraged previous researchers to look for a theory of FGO based on 2-D operations. This is how the human vision community studied FGO. Consider the prototypical example of Edgar Rubin's vase-faces stimulus.2 In this 2-D stimulus, there are two possible interpretations depending on which region is perceived as a "figure" as opposed to the "ground." Similar bistable stimuli have been used during the last several dozen years of FGO research in human vision.3,4 This research provided a large body of results, but few theories and computational models. Furthermore, the proposed models are usually not suitable for real retinal or camera images representing 3-D scenes. This paper breaks with this tradition and looks for 3-D operations that can establish the correct 3-D FGO.
We present a new algorithm for 3D shape reconstruction from stereo image pairs that uses mirror symmetry as a biologically inspired prior. 3D reconstruction requires some form of prior because it is an ill-posed inverse problem. Psychophysical research shows that mirror-symmetry is a key prior for 3D shape perception in humans, suggesting that a general purpose solution to this problem will have many applications. An approach is developed for finding objects that fit a given shape definition. The algorithm is developed for shapes with two orthogonal planes of symmetry, thus allowing for straightforward recovery of occluded portions of the objects. Two simulations were run to test: (1) the accuracy of 3D recovery, and (2) the ability of the algorithm to find the object in the presence of noise. We then tested the algorithm on the Children's Furniture Corpus, a corpus of stereo image pairs of mirror symmetric furniture objects. Runtimes and 3D reconstruction errors are reported and failure modes described.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.