Towards Efficient and Exact MAP-Inference for Large Scale Discrete Computer Vision Problems via Combinatorial Optimization

Kappes, Jörg Hendrik; Speth, M.; Reinelt, Gerhard; Schnörr, Christoph

doi:10.1109/cvpr.2013.229

Cited by 27 publications

(26 citation statements)

References 25 publications

(43 reference statements)


“…Although these are medium sized binary problems, the relaxations over the local polytope are no longer as tight. Only the advanced polyhedral method (MCBC) [20] was able to solve some (56) instances to optimality. The matching problems in Table 6 have very few variables, which is ideal for sophisticated ILP solvers.…”

Section: Discussion (mentioning)

confidence: 99%

A Comparative Study of Modern Inference Techniques for Discrete Energy Minimization Problems

Kappes, Andres, Hamprecht, et al. 2013

2013 IEEE Conference on Computer Vision and Pattern Recognition

Self Cite


Seven years ago, Szeliski et al. published an influential study on energy minimization methods for Markov random fields (MRF). This study provided valuable insights into choosing the best optimization technique for certain classes of problems. While these insights remain generally useful today, the phenomenal success of random field models means that the kinds of inference problems we solve have changed significantly. Specifically, the models today often include higher order interactions, flexible connectivity structures, large label-spaces of different cardinalities, or learned energy tables. To reflect these changes, we provide a modernized and enlarged study. We present an empirical comparison of 24 state-of-the-art techniques on a corpus of 2,300 energy minimization instances from 20 diverse computer vision applications. To ensure reproducibility, we evaluate all methods in the OpenGM2 framework and report extensive results regarding runtime and solution quality. Key insights from our study agree with the results of Szeliski et al. for the types of models they studied. However, on new and challenging types of models our findings disagree and suggest that polyhedral methods and integer programming solvers are competitive in terms of runtime and solution quality over a large range of model types.


“…Globally optimal results for benchmark datasets were reported [37,36] that compare well also in terms of runtime to state-of-the-art methods for approximate inference. However, a detailed evaluation of different separating procedures, its generalization to the higher order case as well as an analysis of the polyhedral relaxations were lacking.…”

Section: Related Work (mentioning)

confidence: 90%

“…finding an optimal multicut with at most k labels, which is known as the multiway cut problem. Compared to the standard (I)LP representation of such problems our approach is considerably more memory efficient and able to provide globally optimal solutions for many computer vision problems in reasonable runtime [36,37,38]. Fig.…”

Section: Overview Motivation (mentioning)

confidence: 99%


Higher-order segmentation via multicuts

Kappes, Speth, Reinelt, et al. 2016

Computer Vision and Image Understanding

Self Cite


Multicuts make it possible to conveniently represent discrete graphical models for unsupervised and supervised image segmentation, in the case of local energy functions that exhibit symmetries. The basic Potts model and natural extensions thereof to higher-order models provide a prominent class of such objectives that cover a broad range of segmentation problems relevant to image analysis and computer vision. We exhibit a way to systematically take into account such higher-order terms for computational inference. Furthermore, we present results of a comprehensive and competitive numerical evaluation of a variety of dedicated cutting-plane algorithms. Our approach enables the globally optimal evaluation of a significant subset of these models, without compromising runtime. Polynomially solvable relaxations are studied as well, along with advanced rounding schemes for post-processing.


“…The rationale behind MAP is the big progress [14] of efficient approximate MAP inference in recent years. We use a modified message passing implementation of [14]. We use tree-reweighted (TRW) [32] messaging schedules.…”

Section: Inference Algorithm (mentioning)

confidence: 99%

Joint Semantic Segmentation and 3D Reconstruction from Monocular Video

Kundu, Dellaert, et al. 2014

Lecture Notes in Computer Science


Abstract. We present an approach for joint inference of 3D scene structure and semantic labeling for monocular video. Starting with a monocular image stream, our framework produces a 3D volumetric semantic + occupancy map, which is much more useful than a series of 2D semantic label images or a sparse point cloud produced by traditional semantic segmentation and Structure from Motion (SfM) pipelines respectively. We derive a Conditional Random Field (CRF) model defined in the 3D space, that jointly infers the semantic category and occupancy for each voxel. Such a joint inference in the 3D CRF paves the way for more informed priors and constraints, which is otherwise not possible if solved separately in their traditional frameworks. We make use of class specific semantic cues that constrain the 3D structure in areas where multiview constraints are weak. Our model comprises higher order factors, which help when the depth is unobservable. We also make use of class specific semantic cues to reduce either the degree of such higher order factors, or to approximately model them with unaries if possible. We demonstrate improved 3D structure and temporally consistent semantic segmentation for difficult, large scale, forward moving monocular image sequences.

Fig. 1. Overview of our system. From a monocular image sequence, we first obtain 2D semantic segmentation, sparse 3D reconstruction and camera poses. We then build a volumetric 3D map which depicts both 3D structure and semantic labels.
