The paper presents a system for automatic, geo-registered, real-time 3D reconstruction of urban scenes from video. The system collects video streams, as well as GPS and inertial measurements, in order to place the reconstructed models in geo-registered coordinates. It is designed using current state-of-the-art real-time modules for all processing steps, and it employs commodity graphics hardware and standard CPUs to achieve real-time performance. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to meet the robustness and variability necessary to operate out of the lab. To account for the large dynamic range of outdoor videos, the processing pipeline estimates global camera gain changes in the feature tracking stage and efficiently compensates for them in stereo estimation without impacting real-time performance. The accuracy required for many applications is achieved with a two-step stereo reconstruction process that exploits the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.
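As a rough illustration of the gain-handling idea mentioned above, the sketch below estimates a single multiplicative gain between two frames from the intensities of tracked features and folds it into a simple SAD matching cost. The function names and the least-squares estimator are illustrative assumptions, not the paper's implementation (which runs inside the feature tracker and the GPU stereo module).

```python
import numpy as np

def estimate_gain(feat_ref, feat_other):
    """Least-squares estimate of a single gain g with I_ref ~= g * I_other,
    computed from the intensities of features tracked across the two frames."""
    feat_ref = np.asarray(feat_ref, dtype=np.float64)
    feat_other = np.asarray(feat_other, dtype=np.float64)
    return float(feat_other @ feat_ref / (feat_other @ feat_other))

def gain_compensated_sad(patch_ref, patch_other, gain):
    """SAD matching cost after scaling the other view by the estimated gain,
    so a global exposure change does not inflate the stereo cost."""
    return float(np.abs(patch_ref - gain * patch_other).sum())

# Toy usage: the other frame is 20% darker than the reference.
rng = np.random.default_rng(0)
patch_ref = rng.random((7, 7))
patch_other = patch_ref / 1.2
g = estimate_gain(patch_ref.ravel(), patch_other.ravel())   # ~1.2
cost = gain_compensated_sad(patch_ref, patch_other, g)      # ~0.0
```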
We present a viewpoint-based approach for the quick fusion of multiple stereo depth maps. Our method selects depth estimates for each pixel that minimize violations of visibility constraints, thus removing errors and inconsistencies from the depth maps to produce a consistent surface. We advocate a two-stage process in which the first stage generates potentially noisy, overlapping depth maps from a set of calibrated images and the second stage fuses these depth maps to obtain an integrated surface with higher accuracy, suppressed noise, and reduced redundancy. We show that by dividing the processing into two stages we are able to achieve very high throughput, because we can use a computationally cheap stereo algorithm and because this architecture is amenable to hardware-accelerated (GPU) implementations. A rigorous formulation based on the notion of the stability of a depth estimate is presented first. It determines the validity of a depth estimate by rendering multiple depth maps into the reference view, as well as rendering the reference depth map into the other views, in order to detect occlusions and free-space violations. We also present an approximate alternative formulation that selects and validates only one hypothesis based on confidence. Both formulations enable us to perform video-based reconstruction at up to 25 frames per second. We show results on the Multi-View Stereo Evaluation benchmark datasets and several outdoor video sequences. Extensive quantitative analysis is performed using an accurately surveyed model of a real building as ground truth.
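The following is a much-simplified, CPU-only sketch of per-pixel depth selection in the spirit of the fusion step: it assumes the depth maps have already been rendered into the reference view and treats occlusions and free-space violations uniformly as relative-depth disagreements, whereas the paper distinguishes the two and performs the rendering on the GPU. All names and thresholds are illustrative.

```python
import numpy as np

def fuse_depths(rendered, eps=0.02):
    """Per-pixel selection among N depth maps already rendered into the
    reference view (array of shape N x H x W, 0 = no estimate).  Each
    candidate is scored by how many other estimates conflict with it
    (differ by more than a relative tolerance); the candidate with the
    fewest conflicts wins, ties going to the nearer surface."""
    n_maps, h, w = rendered.shape
    fused = np.zeros((h, w), dtype=rendered.dtype)
    for y in range(h):
        for x in range(w):
            cands = rendered[:, y, x]
            cands = cands[cands > 0]
            if cands.size == 0:
                continue                      # no estimate survives here
            diff = np.abs(cands[:, None] - cands[None, :])
            conflicts = (diff > eps * cands[None, :]).sum(axis=1)
            best = np.lexsort((cands, conflicts))[0]
            fused[y, x] = cands[best]
    return fused

# Toy usage: five 2x2 depth maps, one of which is a gross outlier.
maps = np.full((5, 2, 2), 4.0)
maps[0] *= 2.0
print(fuse_depths(maps))                      # ~4.0 everywhere
```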
Department of Computer Science, University of North Carolina, Chapel Hill, USA · Center for Visualization and Virtual Environments, University of Kentucky, Lexington, USA.

Abstract — Recent research has focused on systems for obtaining automatic 3D reconstructions of urban environments from video acquired at street level. These systems record enormous amounts of video; therefore a key component is a stereo matcher which can process this data at speeds comparable to the recording frame rate. Furthermore, urban environments are unique in that they exhibit mostly planar surfaces. These surfaces, which are often imaged at oblique angles, pose a challenge for many window-based stereo matchers, which suffer in the presence of slanted surfaces. We present a multi-view plane-sweep-based stereo algorithm which correctly handles slanted surfaces and runs in real-time using the graphics processing unit (GPU). Our algorithm consists of (1) identifying the scene's principal plane orientations, (2) estimating depth by performing a plane-sweep for each direction, and (3) combining the results of each sweep. The latter can optionally be performed using graph cuts. Additionally, by incorporating priors on the locations of planes in the scene, we can increase the quality of the reconstruction and reduce computation time, especially for uniform textureless surfaces. We demonstrate our algorithm on a variety of scenes and show the improved accuracy obtained by accounting for slanted surfaces.

1. Introduction — Reconstructions of buildings in 3D from aerial or satellite imagery have long been a topic of research in computer vision and photogrammetry. The success of such research can be seen in applications such as Google Earth and Microsoft Virtual Earth, which now offer 3D visualizations of several cities. However, such visualizations lack ground-level realism, due mostly to the point of view of the imagery. A different approach is to generate visualizations in the form of panoramas [16, 12], which require less data to be constructed but also limit the user's ability to freely navigate the environment. Recent research has focused on systems for obtaining automatic 3D reconstructions of urban environments from video acquired at street level [15, 13, 6]. Urban environments are unique in that they exhibit mostly planar surfaces. A typical image, for example, may contain a ground plane and multiple facade planes intersecting at right angles. Many systems aim to reconstruct such imagery using sparse techniques, which examine point or line correspondences ...
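For concreteness, the sketch below shows the plane-induced homography that underlies a single sweeping direction, under the assumed convention that the plane is n·X = d in the reference camera frame and X_other = R·X_ref + t; it is not the paper's GPU implementation, and the function names are hypothetical.

```python
import numpy as np

def plane_homography(K_ref, K_other, R, t, n, d):
    """Maps a reference-view pixel to its location in the other view for the
    plane n.X = d written in the reference camera frame (X_other = R X_ref + t).
    The matcher samples the other image at H @ x_ref (homogeneous pixel
    coordinates) and compares it to the reference pixel."""
    H = K_other @ (R + np.outer(t, n) / d) @ np.linalg.inv(K_ref)
    return H / H[2, 2]

def sweep_homographies(K_ref, K_other, R, t, n, depths):
    """One sweeping direction: a plane family with fixed normal n and varying
    depth d; a warp-and-score pass is run for every plane in the family."""
    return [plane_homography(K_ref, K_other, R, t, n, d) for d in depths]

# Fronto-parallel sweep (normal along the optical axis) as the classic special
# case; facade or ground-plane sweeps simply use different normals.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
R, t = np.eye(3), np.array([0.2, 0.0, 0.0])          # small lateral baseline
Hs = sweep_homographies(K, K, R, t, np.array([0.0, 0.0, 1.0]),
                        depths=np.linspace(2.0, 20.0, 64))
```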
While machine learning has been instrumental to the ongoing progress in most areas of computer vision, it has not been applied to the problem of stereo matching with similar frequency or success. We present a supervised learning approach for predicting the correctness of stereo matches based on a random forest and a set of features that capture various forms of information about each pixel. We show highly competitive results in predicting the correctness of matches and in confidence estimation, which allows us to rank pixels according to the reliability of their assigned disparities. Moreover, we show how these confidence values can be used to improve the accuracy of disparity maps by integrating them with an MRF-based stereo algorithm. This is an important distinction from the current literature, which has mainly focused on sparsification, removing potentially erroneous disparities to generate quasi-dense disparity maps.
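A minimal sketch of the training and prediction loop for such a confidence predictor, using scikit-learn's random forest; the per-pixel features listed in the comments are illustrative placeholders rather than the feature set used in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_confidence_forest(features, is_correct, n_trees=100):
    """Fit a random forest that predicts whether a pixel's disparity is
    correct.  `features` is (num_pixels x num_features); `is_correct` is a
    boolean vector obtained by comparing to ground-truth disparities."""
    forest = RandomForestClassifier(n_estimators=n_trees, n_jobs=-1)
    forest.fit(features, is_correct)
    return forest

def confidence_map(forest, features, image_shape):
    """Per-pixel probability of correctness, reshaped to the image grid.
    The values can rank pixels by reliability or serve as a soft per-pixel
    prior in an MRF stereo energy."""
    return forest.predict_proba(features)[:, 1].reshape(image_shape)

# Toy usage with random placeholder features (e.g. matching cost, cost-curve
# margin, left-right difference, local disparity variance -- illustrative only).
rng = np.random.default_rng(0)
X_train = rng.random((10_000, 4))
y_train = X_train[:, 0] > 0.5                 # stand-in for "match is correct"
forest = train_confidence_forest(X_train, y_train)
conf = confidence_map(forest, rng.random((48 * 64, 4)), (48, 64))
```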
The paper introduces a data collection system and a processing pipeline for automatic geo-registered 3D reconstruction of urban scenes from video. The system collects multiple video streams, as well as GPS and INS measurements, in order to place the reconstructed models in geo-registered coordinates. Besides high quality in terms of both geometry and appearance, we aim at real-time performance. Even though our processing pipeline is currently far from real-time, we select techniques and design processing modules that can achieve fast performance on multiple CPUs and GPUs, aiming at real-time performance in the near future. We present the main considerations in designing the system and the steps of the processing pipeline. We show results on real video sequences captured by our system.
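One common way to realize the geo-registration step (an assumption here, not necessarily this system's procedure) is to align the reconstructed camera centres with the GPS/INS positions by a least-squares similarity transform (Umeyama's method), sketched below.

```python
import numpy as np

def similarity_align(src, dst):
    """Least-squares similarity transform (scale s, rotation R, translation t)
    such that dst_i ~= s * R @ src_i + t (Umeyama's method).
    src: reconstructed camera centres (N x 3), dst: GPS/INS positions (N x 3)."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    n = len(src)
    mu_s, mu_d = src.mean(0), dst.mean(0)
    xs, xd = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(xd.T @ xs / n)   # 3x3 cross-covariance
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0                        # guard against reflections
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / (xs ** 2).sum() * n
    t = mu_d - s * R @ mu_s
    return s, R, t

# Toy check: recover a known scale, rotation, and translation.
rng = np.random.default_rng(0)
pts = rng.random((20, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
s, R, t = similarity_align(pts, 2.5 * pts @ R_true.T + np.array([10.0, -3.0, 0.5]))
```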