2007 IEEE 11th International Conference on Computer Vision
DOI: 10.1109/iccv.2007.4408984
Real-Time Visibility-Based Fusion of Depth Maps

Abstract: We present a viewpoint-based approach for the quick fusion of multiple stereo depth maps. Our method selects depth estimates for each pixel that minimize violations of visibility constraints and thus remove errors and inconsistencies from the depth maps to produce a consistent surface. We advocate a two-stage process in which the first stage generates potentially noisy, overlapping depth maps from a set of calibrated images and the second stage fuses these depth maps to obtain an integrated surface with higher…
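The per-pixel selection rule the abstract describes can be pictured with a small conflict count along each reference ray. The following is only a rough sketch of the idea, not the authors' algorithm: the symmetric conflict test, the relative tolerance eps, and the assumption that every depth map has already been rendered into the reference view are mine, and plain NumPy is used purely for illustration.

    import numpy as np

    def select_depths(depths, eps=0.01):
        # depths: (N, H, W) depth hypotheses rendered into the reference
        # view, with np.inf where a map has no estimate for that pixel.
        n, h, w = depths.shape
        finite = np.isfinite(depths)
        d = np.where(finite, depths, 0.0)          # sentinel keeps the math NaN-free

        # Two hypotheses on the same ray conflict when they disagree by
        # more than a relative tolerance: the nearer one would occlude the
        # farther one, and the farther one implies free space where the
        # nearer one claims a surface.
        pair = np.abs(d[:, None] - d[None, :]) > eps * np.minimum(d[:, None], d[None, :])
        pair &= finite[:, None] & finite[None, :]  # only real estimates can conflict

        conflicts = pair.sum(axis=1)               # violations incurred by each hypothesis
        conflicts = np.where(finite, conflicts, n + 1)  # never select a missing estimate
        best = conflicts.argmin(axis=0)            # least-conflicted map per pixel
        return np.take_along_axis(depths, best[None], axis=0)[0]

For a pixel where, say, three maps agree at 4.0 m and one outlier claims 1.5 m, each agreeing hypothesis incurs one conflict while the outlier incurs three, so an agreeing depth is selected.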

Cited by 328 publications (229 citation statements)
References 24 publications
“…This motion-covariant feature is naturally dependent on the extent to which objects move, so should help separate buildings from cars, for example. This cue is illustrated in the supplementary video.…”
Section: Cues From Point Clouds
Confidence: 99%
“…Our algorithm is able to accurately recognize objects and segment video frames without appearance-based descriptors or dense depth estimates obtained using, e.g., dense stereo or laser range finders. The structure from motion, or SfM, community [1] has demonstrated the value of ego-motion derived data, and their modeling efforts have even extended to stationary geometry of cities [2]. However, the object recognition opportunities presented by the inferred motion and structure have largely been ignored.…”
Section: Introduction
Confidence: 99%
“…Given a video (and view pose) from a moving camera, we obtain coarse disparity estimates by finding maxima of normalized cross correlation (NCC) scores for 5×5 windows between pairs of frames, similar to [17], and fuse these estimates. We quantitatively evaluate the methods by texture mapping one of our synthetic scenes (with sensor motion) and applying this approach. The results, in figure 7, show TSDF performing significantly better, and in particular the Generative methods suffering from a lack of disparity accuracy (fig. …”
Section: Results
Confidence: 99%
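The window-based NCC matching this statement refers to is standard; below is a minimal brute-force sketch of winner-take-all disparity from 5×5 NCC windows over a rectified frame pair. The function names, the search range max_disp, and the use of plain NumPy are assumptions for illustration, not the cited pipeline.

    import numpy as np

    def ncc_score(patch_a, patch_b):
        # Normalized cross-correlation between two equal-sized patches.
        a = patch_a - patch_a.mean()
        b = patch_b - patch_b.mean()
        denom = np.sqrt((a * a).sum() * (b * b).sum())
        return (a * b).sum() / denom if denom > 0 else 0.0

    def coarse_disparity(left, right, max_disp=32, half=2):
        # Winner-take-all disparity: for each pixel, keep the horizontal
        # shift whose (2*half+1)-square window maximizes the NCC score.
        h, w = left.shape
        disp = np.zeros((h, w), dtype=np.int32)
        for y in range(half, h - half):
            for x in range(half, w - half):
                ref = left[y - half:y + half + 1, x - half:x + half + 1]
                best, best_d = -1.0, 0
                for d in range(0, min(max_disp, x - half) + 1):
                    cand = right[y - half:y + half + 1,
                                 x - d - half:x - d + half + 1]
                    s = ncc_score(ref, cand)
                    if s > best:
                        best, best_d = s, d
                disp[y, x] = best_d
        return disp

A real implementation would vectorize the search and refine the maxima to sub-pixel precision; the explicit triple loop is kept here only for clarity.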
“…Some model pixel depths in an MRF, determining visibility from those depths [1,2], a general approach that has since been very popular, and even made real-time [3]. Others model voxel occupancies, initially considering visibility by visiting unoccluded regions of space first [4], but more recently using complex 3D MRF formulations [5,6], computing depth as a 3D segmentation between visible and invisible points [5], or computing probabilistic occupancies using long-range ray cliques to model visibility [6].…”
Section: Introduction
Confidence: 99%
“…Gargallo and Sturm [6] proposed to formulate 3D modeling from images as a Bayesian MAP problem, using multiple depth maps. Recently, Merrell et al. [11] proposed a quick depth map fusion method to construct a consistent surface. They employed a weighted blending method based on the visibility constraint and confidences.…”
Section: Related Work
Confidence: 99%
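The confidence-weighted, visibility-checked blending this statement attributes to Merrell et al. can be pictured with a short sketch. The agreement test, the inf sentinel for missing estimates, and the normalization below are my assumptions, not the published algorithm.

    import numpy as np

    def fuse_depths(depths, confs, eps=0.01):
        # depths: (N, H, W) per-camera depth estimates rendered into the
        #         reference view; np.inf marks a missing estimate.
        # confs:  (N, H, W) matching confidences.
        nearest = depths.min(axis=0)                # closest estimate owns visibility
        agree = depths <= nearest * (1.0 + eps)     # consistent with the visible surface
        w = np.where(agree & np.isfinite(depths), confs, 0.0)

        safe = np.where(np.isfinite(depths), depths, 0.0)  # avoid 0 * inf
        wsum = w.sum(axis=0)
        blended = (w * safe).sum(axis=0) / np.maximum(wsum, 1e-12)
        return np.where(wsum > 0, blended, np.inf)  # inf where nothing survived

Estimates lying well behind the closest surface would require other cameras to have seen through it, so they are rejected as visibility violations; the surviving, mutually consistent estimates are averaged with confidence weights.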