In this paper we address the problem of minimizing a large class of energy functions that occur in early vision. The major restriction is that the energy function's smoothness term must only involve pairs of pixels. We propose two algorithms that use graph cuts to compute a local minimum even when very large moves are allowed. The first move we consider is an α-β-swap: for a pair of labels α, β, this move exchanges the labels between an arbitrary set of pixels labeled α and another arbitrary set labeled β. Our first algorithm generates a labeling such that there is no swap move that decreases the energy. The second move we consider is an α-expansion: for a label α, this move assigns an arbitrary set of pixels the label α. Our second algorithm, which requires the smoothness term to be a metric, generates a labeling such that there is no expansion move that decreases the energy. Moreover, this solution is within a known factor of the global minimum. We experimentally demonstrate the effectiveness of our approach on image restoration, stereo and motion.

Energy minimization in early vision

Many early vision problems require estimating some spatially varying quantity (such as intensity or disparity) from noisy measurements. Such quantities tend to be piecewise smooth; they vary smoothly at most points, but change dramatically at object boundaries. Every pixel p ∈ P must be assigned a label in some set L; for motion or stereo, the labels are disparities, while for image restoration they represent intensities. The goal is to find a labeling f that assigns each pixel p ∈ P a label $f_p$ ∈ L, where f is both piecewise smooth and consistent with the observed data.

These vision problems can be naturally formulated in terms of energy minimization. In this framework, one seeks the labeling f that minimizes the energy

$$E(f) = E_{smooth}(f) + E_{data}(f).$$

Here $E_{smooth}$ measures the extent to which f is not piecewise smooth, while $E_{data}$ measures the disagreement between f and the observed data.
Many different energy functions have been proposed in the literature. The form of $E_{data}$ is typically

$$E_{data}(f) = \sum_{p \in P} D_p(f_p),$$

where $D_p$ measures how appropriate a label is for the pixel p given the observed data. In image restoration, for example, $D_p(f_p)$ is typically $(f_p - i_p)^2$, where $i_p$ is the observed intensity of the pixel p.

The choice of $E_{smooth}$ is a critical issue, and many different functions have been proposed. For example, in standard regularization-based vision [6], $E_{smooth}$ makes f smooth everywhere. This leads to poor results at object boundaries. Energy functions that do not have this problem are called discontinuity-preserving. A large number of discontinuity-preserving energy functions have been proposed (see for example [7]). Geman and Geman's seminal paper [3] gave a Bayesian interpretation of many energy functions, and proposed a discontinuity-preserving energy function based on Markov Random Fields (MRFs).

The major difficulty with energy minimization for early vision lies in the enormous computational costs. Typically these energy functions have many local minima (i.e., they are non-convex). Worse still...
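To make the energy and the expansion move concrete, here is a minimal Python sketch on a tiny 1-D signal. It assumes the squared-difference data term used for image restoration and a Potts smoothness term over neighbouring pixel pairs; the Potts term, the `weight` parameter, the 1-D neighbourhood, and all function names are illustrative assumptions. The best α-expansion is found here by brute-force enumeration, which is exponential in the number of pixels; the excerpt's algorithms use graph cuts for this step instead, and this sketch does not implement them.

```python
from itertools import product

def data_energy(f, observed):
    """E_data(f): sum over pixels of (f_p - i_p)^2."""
    return sum((fp - ip) ** 2 for fp, ip in zip(f, observed))

def smooth_energy(f, weight=1.0):
    """Assumed Potts term: pay `weight` for each neighbouring pair with different labels."""
    return weight * sum(1 for a, b in zip(f, f[1:]) if a != b)

def energy(f, observed, weight=1.0):
    """E(f) = E_smooth(f) + E_data(f)."""
    return smooth_energy(f, weight) + data_energy(f, observed)

def best_expansion(f, alpha, observed, weight=1.0):
    """Brute-force best alpha-expansion: every pixel keeps its label or takes alpha."""
    best, best_e = list(f), energy(f, observed, weight)
    for mask in product([False, True], repeat=len(f)):
        g = [alpha if m else fp for m, fp in zip(mask, f)]
        e = energy(g, observed, weight)
        if e < best_e:
            best, best_e = g, e
    return best

def expansion_local_min(f, labels, observed, weight=1.0):
    """Cycle over labels until no expansion move decreases the energy."""
    f = list(f)
    improved = True
    while improved:
        improved = False
        for alpha in labels:
            g = best_expansion(f, alpha, observed, weight)
            if energy(g, observed, weight) < energy(f, observed, weight):
                f, improved = g, True
    return f

# Hypothetical noisy signal with one sharp discontinuity.
observed = [0, 1, 0, 5, 4, 5]
labels = range(6)
start = [0] * len(observed)
result = expansion_local_min(start, labels, observed, weight=2.0)
print(result, energy(result, observed, weight=2.0))
```

The loop stops at a labeling from which no single expansion move lowers the energy, which is the kind of local minimum the abstract describes for the second algorithm.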
In this paper we describe a new technique for general purpose interactive segmentation of N-dimensional images. The user marks certain pixels as "object" or "background" to provide hard constraints for segmentation. Additional soft constraints incorporate both boundary and region information. Graph cuts are used to find the globally optimal segmentation of the N-dimensional image. The obtained solution gives the best balance of boundary and region properties among all segmentations satisfying the constraints. The topology of our segmentation is unrestricted and both "object" and "background" segments may consist of several isolated parts. Some experimental results are presented in the context of photo/video editing and medical image segmentation. We also demonstrate an interesting Gestalt example. A fast implementation of our segmentation method is possible via a new max-flow algorithm in [2].
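The interplay of hard seeds and soft region/boundary costs can be illustrated on a toy 1-D image. The sketch below is only an illustration under stated assumptions: the region cost (squared distance to an assumed mean intensity per segment), the boundary cost (a Gaussian of the intensity difference), and the parameters `lam` and `sigma` are hypothetical choices, not taken from the abstract. It also finds the optimum by enumerating all labelings rather than by a graph cut, which is only feasible for a handful of pixels.

```python
from itertools import product
import math

def segment(intensities, seeds, obj_mean, bkg_mean, lam=1.0, sigma=1.0):
    """Return the cheapest 'obj'/'bkg' labeling that respects the seeds.

    seeds maps a pixel index to 'obj' or 'bkg' (the hard constraints).
    """
    n = len(intensities)
    best, best_cost = None, math.inf
    for labels in product(("obj", "bkg"), repeat=n):
        # Hard constraints: skip any labeling that contradicts a seed.
        if any(labels[p] != s for p, s in seeds.items()):
            continue
        # Assumed region term: squared distance to the segment's mean intensity.
        region = sum(
            (intensities[p] - (obj_mean if labels[p] == "obj" else bkg_mean)) ** 2
            for p in range(n)
        )
        # Assumed boundary term: cutting between similar neighbours is expensive.
        boundary = sum(
            math.exp(-((intensities[p] - intensities[p + 1]) ** 2) / (2 * sigma ** 2))
            for p in range(n - 1)
            if labels[p] != labels[p + 1]
        )
        cost = lam * region + boundary
        if cost < best_cost:
            best, best_cost = labels, cost
    return list(best), best_cost

# Hypothetical image: bright object, dark gap, bright object again.
intensities = [9, 8, 9, 1, 2, 9, 8]
seeds = {0: "obj", 3: "bkg"}
labels, cost = segment(intensities, seeds, obj_mean=9, bkg_mean=1)
print(labels)
```

In this example the "object" segment comes out as two isolated parts (pixels 0 to 2 and 5 to 6), matching the abstract's remark that the topology of the segmentation is unrestricted.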
Abstract. After [10,15,12,2,4], minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max-flow algorithms for energy minimization in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-style "push-relabel" methods and algorithms based on Ford-Fulkerson style augmenting paths. We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and interactive segmentation. In many cases our new algorithm works several times faster than any of the other methods, making near real-time performance possible.
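For readers unfamiliar with the augmenting-path family mentioned above, the following is a minimal sketch of the textbook Edmonds-Karp method (Ford-Fulkerson with breadth-first shortest augmenting paths). It is not the new algorithm the abstract refers to, and the dict-of-dicts graph representation and the example graph are assumptions made for illustration.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp max flow. capacity[u][v] is the capacity of edge u -> v.

    Returns (flow value, set of nodes on the source side of a minimum cut).
    """
    # Build the residual graph, adding zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search for a shortest path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # No augmenting path left: the flow is maximal.

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Nodes still reachable from the source form the source side of a min cut.
    return flow, set(parent)

# Hypothetical example graph.
graph = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
value, source_side = max_flow(graph, "s", "t")
print(value, source_side)
```

When the search fails, the nodes it did reach are exactly those on the source side of a minimum cut, which is how a max-flow computation yields the cut used for labeling or segmentation.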