Abstract: As an early stage of video processing, we introduce an iterative trajectory merging algorithm that produces a region-based, hierarchical representation of the video sequence, called the Trajectory Binary Partition Tree (BPT). From this representation, many analysis and graph-cut techniques can be used to extract partitions or objects that are useful in the context of specific applications. In order to define trajectories and to create a precise merging algorithm, color and motion cues have to be used. Both types…
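The iterative merging that the abstract describes can be illustrated with a generic greedy binary partition tree construction. The sketch below is not the paper's algorithm: it assumes each region carries a single scalar feature, uses the absolute feature difference as the merging criterion (the paper relies on color and motion cues), and the function name `build_bpt` and its parameters are hypothetical.

```python
def build_bpt(features, sizes, edges):
    """Greedily merge adjacent regions into a binary partition tree.

    features: dict region_id -> scalar feature (assumed stand-in for color/motion)
    sizes:    dict region_id -> region size, used to weight the merged feature
    edges:    iterable of (a, b) pairs of adjacent region ids
    Returns the merge sequence as a list of (child_a, child_b, parent) tuples.
    """
    features = dict(features)
    sizes = dict(sizes)
    neighbors = {r: set() for r in features}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    next_id = max(features) + 1
    merges = []
    while True:
        # Collect every pair of currently adjacent regions.
        pairs = {(min(a, b), max(a, b)) for a in neighbors for b in neighbors[a]}
        if not pairs:
            break
        # Pick the most similar pair; ties are broken by region ids.
        a, b = min(pairs, key=lambda p: (abs(features[p[0]] - features[p[1]]), p))

        # Create the parent node with a size-weighted mean feature.
        parent = next_id
        next_id += 1
        total = sizes[a] + sizes[b]
        features[parent] = (features[a] * sizes[a] + features[b] * sizes[b]) / total
        sizes[parent] = total

        # The parent inherits the neighbors of both children.
        inherited = (neighbors[a] | neighbors[b]) - {a, b}
        neighbors[parent] = inherited
        for n in inherited:
            neighbors[n] -= {a, b}
            neighbors[n].add(parent)
        del neighbors[a], neighbors[b]

        merges.append((a, b, parent))
    return merges


# Four regions in a chain: 0 - 1 - 2 - 3
merges = build_bpt(
    features={0: 0.0, 1: 2.0, 2: 10.0, 3: 11.0},
    sizes={0: 1, 1: 1, 2: 1, 3: 1},
    edges=[(0, 1), (1, 2), (2, 3)],
)
print(merges)
```

Each tuple records one internal node of the tree, so the last merge is the root. Cutting the tree at different levels yields partitions at different scales, which is what makes the hierarchy useful for later graph-cut or analysis steps.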
“…Hierarchical representations are employed in some methods, such as [2], [40], [28], [15], to represent the raw data from coarse to fine. A hierarchical representation usually starts from the segments at a relatively fine level, such as superpixels or over-segmented regions recovered from a contour probability map [2] for RGB data, or super-voxels [30], [6] in the case of RGBD data.…”
Section: A Related Work
mentioning
confidence: 99%
“…On the other hand, exploiting temporal coherence also helps to better construct the hierarchical representation at each frame. For instance, in [28], long term trajectories are leveraged to help building BPTs.…”
Abstract: Video segmentation is an important building block for high level applications such as scene understanding and interaction analysis. While outstanding results are achieved in this field by state-of-the-art learning and model based methods, they are restricted to certain types of scenes or require a large amount of annotated training data to achieve object segmentation in generic scenes. On the other hand, RGBD data, widely available with the introduction of consumer depth sensors, provides actual world 3D geometry compared to 2D images. The explicit geometry in RGBD data greatly helps in computer vision tasks, but the lack of annotations in this type of data may also hinder the extension of learning based methods to RGBD. In this paper, we present a novel generic segmentation approach for 3D point cloud video (stream data) thoroughly exploiting the explicit geometry in RGBD. Our proposal is only based on low level features, such as connectivity and compactness. We exploit temporal coherence by representing the rough estimation of objects in a single frame with a hierarchical structure, and propagating this hierarchy along time. The hierarchical structure provides an efficient way to establish temporal correspondences at different scales of object-connectivity, and to temporally manage the splits and merges of objects. This allows updating the segmentation according to the evidence observed in the history. The proposed method is evaluated on several challenging datasets, with promising results for the presented approach.
“…The literature on the topic has become prolific [7,43,2,28,27,19,11,10,4,29] and a number of techniques have become available, e.g. generative layered models [25,26], graph-based models [20,46,36] and spectral techniques [39,8,15,18,32,35,16].…”
Abstract. In recent years it has been shown that clustering and segmentation methods can greatly benefit from the integration of prior information in terms of must-link constraints. Very recently the use of such constraints has been integrated in a rigorous manner also in graph-based methods such as normalized cut. On the other hand spectral clustering as relaxation of the normalized cut has been shown to be among the best methods for video segmentation. In this paper we merge these two developments and propose to learn must-link constraints for video segmentation with spectral clustering. We show that the integration of learned must-link constraints not only improves the segmentation result but also significantly reduces the required runtime, making the use of costly spectral methods possible for today's high quality video.
“…Among unsupervised multiple segmentation methods that generate an exhaustive list of video segments, agglomerative or spectral clustering on superpixels/supervoxels has been popular [10,15,19,22,24,27,36,40,43]. Some approaches utilize tracked feature points [5,6,14,29,34].…”
Section: Related Work
mentioning
confidence: 99%
“…Video segmentation has been defined by different researchers as separating foreground from background [4,28,37,45], identifying moving objects [12,29,34], creating segmentation proposals [3,28,31,45], computing hierarchical sets of coarse-to-fine video segments [19,24,36,42], or generating motion segmentations [24,35]. Each of these definitions has its own merits and applications.…”
We propose a robust algorithm to generate video segment proposals. The proposals generated by our method can start from any frame in the video and are robust to complete occlusions. Our method does not assume specific motion models and even has a limited capability to generalize across videos. We build on our previous least squares tracking framework, where image segment proposals are generated and tracked using learned appearance models. The innovation in our new method lies in the use of two efficient moves, the merge move and free addition, to efficiently start segments from any frame and track them through complete occlusions, without much additional computation. Segment size interpolation is used for effectively detecting occlusions. We propose a new metric for evaluating video segment proposals on the challenging VSB-100 benchmark and present state-of-the-art results. Preliminary results are also shown for the potential use of our framework to track segments across different videos.