Abstract: Unsupervised image segmentation is an important component in many image understanding algorithms and practical vision systems. However, evaluation of segmentation algorithms thus far has been largely subjective, leaving a system designer to judge the effectiveness of a technique based only on intuition and results in the form of a few example segmented images. This is largely due to image segmentation being an ill-defined problem: there is no unique ground-truth segmentation of an image against which the output of an algorithm may be compared. This paper demonstrates how a recently proposed measure of similarity, the Normalized Probabilistic Rand (NPR) index, can be used to perform a quantitative comparison between image segmentation algorithms using a hand-labeled set of ground-truth segmentations. We show that the measure allows principled comparisons between segmentations created by different algorithms, as well as segmentations of different images. We outline a procedure for algorithm evaluation through an example evaluation of four familiar algorithms: the mean-shift-based algorithm, an efficient graph-based segmentation algorithm, a hybrid algorithm that combines the strengths of both methods, and expectation maximization. Results are presented on the 300 images in the publicly available Berkeley Segmentation Data Set.
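As an illustration of the pairwise label-agreement computation on which the (N)PR index rests, the following Python sketch computes the unnormalized Probabilistic Rand index of a test segmentation against a set of manual segmentations. The function name and the brute-force O(N^2) pair enumeration are our own choices for clarity, not the implementation used in the paper.

import numpy as np

def probabilistic_rand(test, ground_truths):
    # test: integer label array over N pixels; ground_truths: a list of
    # label arrays over the same N pixels. Returns the mean, over all
    # unordered pixel pairs, of the probability that the test
    # segmentation agrees with the manual segmentations on whether the
    # pair shares a label. Brute-force O(N^2) memory; sketch only.
    test = np.asarray(test).ravel()
    n = test.size
    # p[i, j]: fraction of ground truths in which pixels i, j share a label
    p = np.zeros((n, n))
    for gt in ground_truths:
        gt = np.asarray(gt).ravel()
        p += (gt[:, None] == gt[None, :])
    p /= len(ground_truths)
    # c[i, j]: 1 when the test segmentation puts i and j in the same segment
    c = (test[:, None] == test[None, :]).astype(float)
    iu = np.triu_indices(n, k=1)  # unordered pairs i < j
    agree = c[iu] * p[iu] + (1.0 - c[iu]) * (1.0 - p[iu])
    return agree.mean()

# e.g. probabilistic_rand([0, 0, 1, 1], [[0, 0, 1, 1], [0, 0, 2, 2]]) -> 1.0,
# since labels are compared only through pairwise co-membership, not identity.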
Quantitative evaluation and comparison of image segmentation algorithms is now feasible owing to the recent availability of collections of hand-labeled images. However, little attention has been paid to the design of measures that compare one segmentation result to one or more manual segmentations of the same image. Existing measures in the statistics and computer vision literature suffer either from intolerance to labeling refinement, making them unsuitable for image segmentation, or from the existence of degenerate cases, which makes training algorithms against these measures prone to failure. This paper surveys previous work on measures of similarity and illustrates scenarios where they are applicable for performance evaluation in computer vision. For the image segmentation problem, we propose a measure that addresses the above concerns and has desirable properties such as accommodation of labeling errors at segment boundaries, region-sensitive refinement, and compensation for differences in segment ambiguity between images.
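For reference, the Probabilistic Rand (PR) index underlying the proposed measure, and its normalization, take the following form in the NPR literature (N pixels; K manual segmentations $S_1,\dots,S_K$; $c_{ij}$ indicates whether pixels $i$ and $j$ carry the same label in the test segmentation; $\bar{p}_{ij}$ is the fraction of the manual segmentations in which they do):

$$\mathrm{PR}\big(S_{\text{test}},\{S_k\}\big) \;=\; \binom{N}{2}^{-1} \sum_{i<j} \Big[\, c_{ij}\,\bar{p}_{ij} + (1 - c_{ij})\,(1 - \bar{p}_{ij}) \,\Big], \qquad \mathrm{NPR} \;=\; \frac{\mathrm{PR} - \mathbb{E}[\mathrm{PR}]}{\max \mathrm{PR} - \mathbb{E}[\mathrm{PR}]}.$$

The expected index $\mathbb{E}[\mathrm{PR}]$ is estimated from segment statistics across all images and ground-truth segmentations in the dataset; this baseline is what provides the compensation for differences in segment ambiguity between images mentioned above.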
This document is the extended version of the work published in [11]. Laser-based range sensors are commonly used on board autonomous mobile robots for obstacle detection and scene understanding. A popular methodology for analyzing point cloud data from these sensors is to train Bayesian classifiers on labeled data using locally computed features, and to use them to compute class posteriors online at test time. However, data from range sensors present a unique challenge for feature computation in the form of significant variation in the spatial density of points, both across the field of view and within structures of interest. In particular, this poses the problem of choosing a scale of analysis and a support-region size at which meaningful features can be computed reliably. While scale theory has been rigorously developed for 2-D images, no equivalent exists for unorganized 3-D point data. Choosing a single fixed scale over the entire dataset makes feature extraction sensitive to the presence of different manifolds in the data and to varying data density. We adopt an approach inspired by recent developments in computational geometry [17] and investigate the problem of automatic, data-driven scale selection to improve point cloud classification. The approach is validated with results on real data from different sensors in various environments (indoor, urban outdoor, and natural outdoor), classified into different terrain types (vegetation, solid surface, and linear structure).
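To make data-driven scale selection concrete, here is a simplified Python sketch that picks a per-point support radius by eigen-analysis of the local covariance over a set of candidate radii, keeping the scale at which eigenvalue-based "dimensionality" saliencies are most stable. The stability criterion and all names are our own illustrative choices, not the criterion developed in the paper (which builds on [17]).

import numpy as np
from scipy.spatial import cKDTree

def select_support_radii(points, radii, min_neighbors=10):
    # points: (N, 3) array; radii: increasing candidate support radii.
    # At each radius, eigen-decompose the neighborhood covariance and
    # form normalized saliencies (scatter-ness, surface-ness,
    # linear-ness). Per point, keep the radius at which these change
    # least between consecutive scales, i.e. where the local geometry
    # estimate is most stable.
    tree = cKDTree(points)
    sal = np.full((len(radii), len(points), 3), np.nan)
    for s, r in enumerate(radii):
        for i, nbrs in enumerate(tree.query_ball_point(points, r)):
            if len(nbrs) < min_neighbors:
                continue  # too sparse at this scale; leave as NaN
            ev = np.linalg.eigvalsh(np.cov(points[nbrs].T))  # ascending
            lam = ev[::-1] / max(ev.sum(), 1e-12)            # l0 >= l1 >= l2
            sal[s, i] = (lam[2], lam[1] - lam[2], lam[0] - lam[1])
    # instability of the estimate between consecutive scales
    diff = np.linalg.norm(np.diff(sal, axis=0), axis=2)      # (S-1, N)
    diff = np.where(np.isnan(diff), np.inf, diff)
    return np.asarray(radii)[np.argmin(diff, axis=0) + 1]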
Several computer vision algorithms rely on detecting a compact but representative set of interest regions and their associated descriptors from input data. When the input is an unorganized 3-D point cloud, current practice is to compute shape descriptors either exhaustively or at randomly chosen locations, using one or more preset neighborhood sizes. Such a strategy ignores the relative variation in the spatial extent of geometric structures and risks introducing redundancy into the representation. This paper pursues multi-scale operators on point clouds that allow detection of interest regions whose locations as well as spatial extents are completely data-driven. The approach distinguishes itself from related work by operating directly in the input 3-D space, without assuming an available polygon mesh or resorting to an intermediate global 2-D parameterization. Results demonstrate the utility and robustness of the proposed method.
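A minimal scale-space sketch in the same vein, assuming the generic surface-variation response rather than the operator developed in the paper: compute a curvature-like response at several radii and keep (point, radius) pairs that are extrema both across scale and over their spatial neighborhood, so that location and extent are both data-driven.

import numpy as np
from scipy.spatial import cKDTree

def surface_variation(points, tree, r):
    # lam_min / (lam0 + lam1 + lam2) of the local covariance: a
    # curvature-like response that is high at corners and creases.
    resp = np.zeros(len(points))
    for i, nbrs in enumerate(tree.query_ball_point(points, r)):
        if len(nbrs) >= 5:
            ev = np.linalg.eigvalsh(np.cov(points[nbrs].T))
            resp[i] = ev[0] / max(ev.sum(), 1e-12)
    return resp

def interest_regions(points, radii):
    # Keep (point index, radius) pairs whose response is a maximum over
    # neighboring scales and over the spatial neighborhood at that
    # scale. Generic scale-space extremum detection, sketch only.
    tree = cKDTree(points)
    resp = np.stack([surface_variation(points, tree, r) for r in radii])
    keypoints = []
    for s in range(1, len(radii) - 1):
        for i in np.where((resp[s] > resp[s - 1]) & (resp[s] > resp[s + 1]))[0]:
            nbrs = tree.query_ball_point(points[i], radii[s])
            if resp[s, i] >= resp[s, nbrs].max():
                keypoints.append((i, radii[s]))
    return keypoints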