Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses

Brachmann, Eric; Rother, Carsten

doi:10.1109/iccv.2019.00442

Cited by 245 publications

(181 citation statements)

References 49 publications

Mentioning: 181

“…As per [7], we also compare to a number of other well-known methods that are capable of making use of the 3D model [5,35,59]. Note that, in common with other learning-based methods [5,7,8], we ignore the Street scene, for which our method too was unable to produce reasonable results (the SfM reconstruction in the dataset appears to be of poor quality for this scene [8]).…”

Section: Methods (mentioning)

confidence: 99%

Let's Take This Online: Adapting Scene Coordinate Regression Network Predictions for Online RGB-D Camera Relocalisation

Cavallari, Bertinetto, Mukhoti, et al. 2019

2019 International Conference on 3D Vision (3DV)

Many applications require a camera to be relocalised online, without expensive offline training on the target scene. Whilst both keyframe and sparse keypoint matching methods can be used online, the former often fail away from the training trajectory, and the latter can struggle in textureless regions. By contrast, scene coordinate regression (SCoRe) methods generalise to novel poses and can leverage dense correspondences to improve robustness, and recent work has shown how to adapt SCoRe forests between scenes, allowing their state-of-the-art performance to be leveraged online. However, because they use features hand-crafted for indoor use, they do not generalise well to harder outdoor scenes. Whilst replacing the forest with a neural network and learning suitable features for outdoor use is possible, the techniques used to adapt forests between scenes are unfortunately harder to transfer to a network context. In this paper, we address this by proposing a novel way of leveraging a network trained on one scene to predict points in another scene. Our approach replaces the appearance clustering performed by the branching structure of a regression forest with a two-step process that first uses the network to predict points in the original scene, and then uses these predicted points to look up clusters of points from the new scene. We show experimentally that our online approach achieves state-of-the-art performance on both the 7-Scenes and Cambridge Landmarks datasets, whilst running in under 300ms, making it highly effective in live scenarios.

Section: Methods (mentioning)

confidence: 99%

“…(iv) Local regression methods generally use regression forests [63,27,69,6,48,13,49,50,12], neural networks [5,7,18,42,43,8], or a mix of the two [46] to predict the scene coordinates of pixels in the input image. They then pass these correspondences to PnP/Kabsch and RANSAC.…”

Section: Back-project Points (mentioning)

confidence: 99%
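The hand-off described in this statement (predicted scene coordinates passed to a pose solver inside RANSAC) can be illustrated with the hypothesis-scoring step alone. The following is a minimal sketch, not the cited authors' implementation: it assumes a pinhole camera, a pose given as a world-to-camera rotation `R` and translation `t`, and a 10-pixel reprojection threshold. The function names `project` and `count_inliers` are hypothetical, and the PnP/Kabsch solver that would generate `R` and `t` is left out.

```python
import math

def project(point, R, t, fx, fy, cx, cy):
    """Project a 3D scene point into the image (assumed world-to-camera pose)."""
    # Camera-frame coordinates: R @ point + t, written out without NumPy.
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0.0:
        return None  # point lies behind the camera
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)

def count_inliers(pixels, scene_coords, R, t, intrinsics, thresh_px=10.0):
    """Score one pose hypothesis by counting 2D-3D correspondences whose
    reprojection error is below thresh_px (threshold is an assumption)."""
    fx, fy, cx, cy = intrinsics
    inliers = 0
    for (u, v), X in zip(pixels, scene_coords):
        p = project(X, R, t, fx, fy, cx, cy)
        if p is not None and math.hypot(p[0] - u, p[1] - v) < thresh_px:
            inliers += 1
    return inliers

# Usage: each pixel is paired with the scene coordinate predicted for it.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixels = [(320.0, 240.0), (570.0, 240.0)]
scene_coords = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
score = count_inliers(pixels, scene_coords, identity, [0.0, 0.0, 0.0],
                      (500.0, 500.0, 320.0, 240.0))
```

In a full pipeline, RANSAC would call a scoring function like this once per sampled pose hypothesis and keep the hypothesis with the highest count.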

“…Instead of learning the entire pipeline, scene coordinate regression methods learn the first stage of the pipeline in the structure-based approaches. Namely, either a random forest [4,12,13,20,30,32,33,50,57] or a neural network [3,5,6,7,9,10,11,27,28,30] is trained to directly predict 3D scene coordinates for the pixels and thus the 2D-3D correspondences are established. These methods do not explicitly rely on feature detection, description and matching, and are able to provide correspondences densely.…”

Section: Related Work (mentioning)

confidence: 99%

Hierarchical Scene Coordinate Classification and Regression for Visual Localization

Wang, Zhao, et al. 2020

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

106

Visual localization is critical to many applications in computer vision and robotics. To address single-image RGB localization, state-of-the-art feature-based methods match local descriptors between a query image and a pre-built 3D model. Recently, deep neural networks have been exploited to regress the mapping between raw pixels and 3D coordinates in the scene, and thus the matching is implicitly performed by the forward pass through the network. However, in a large and ambiguous environment, learning such a regression task directly can be difficult for a single network. In this work, we present a new hierarchical scene coordinate network to predict pixel scene coordinates in a coarse-to-fine manner from a single RGB image. The network consists of a series of output layers with each of them conditioned on the previous ones. The final output layer predicts the 3D coordinates and the others produce progressively finer discrete location labels. The proposed method outperforms the baseline regression-only network and allows us to train single compact models which scale robustly to large environments. It sets a new state-of-the-art for single-image RGB localization performance on the 7-Scenes, 12-Scenes, Cambridge Landmarks datasets, and three combined scenes. Moreover, for large-scale outdoor localization on the Aachen Day-Night dataset, our approach is much more accurate than existing scene coordinate regression approaches, and reduces significantly the performance gap w.r.t. explicit feature matching approaches.

“…DSAC [9] adopts a probabilistic selecting process to make it differential so that the complete process can be trained in an end-to-end manner, while Marginalizing Sample Consensus (MAGSAC) [6] proposes to find the optimal model through weighted least-squares fitting without estimation of an inlier-outlier threshold. Recently, Neural-Guided RANSAC (NG-RANSAC) [10] employs deep networks to first estimate the confidence of the putative correspondences being inliers to guide the matching process with improved model hypothesis searching. As the most closely related to our approach, PointCN [27] employs the PointNet-like [33,34] architecture to classify every pair of correspondences as either inlier or outlier and then uses the weighted eight-point algorithm for essential matrix estimation.…”

Section: Related Work (mentioning)

confidence: 99%
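The guided-sampling idea summarised in this statement can be sketched on a toy problem. The example below is an illustrative sketch only, not the NG-RANSAC implementation: it fits a 2D line rather than an essential matrix, and the per-point `weights` are supplied by hand where the method described above would use network-predicted inlier confidences. All function names are hypothetical, and every weight is assumed to be strictly positive.

```python
import random

def sample_minimal_set(weights, k, rng):
    """Draw k distinct indices with probability proportional to weights."""
    pool = list(range(len(weights)))
    w = list(weights)
    chosen = []
    for _ in range(k):
        pick = rng.choices(range(len(pool)), weights=w, k=1)[0]
        chosen.append(pool.pop(pick))
        w.pop(pick)
    return chosen

def fit_line(p, q):
    """Line through p and q as (a, b, c) with a*x + b*y + c = 0, unit normal."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None  # degenerate sample: both points coincide
    return a / norm, b / norm, -(a * x1 + b * y1) / norm

def guided_ransac_line(points, weights, iters=50, thresh=0.1, seed=0):
    """RANSAC whose minimal sets are drawn according to per-point weights
    instead of uniformly; scoring is a plain inlier count."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iters):
        i, j = sample_minimal_set(weights, 2, rng)
        line = fit_line(points[i], points[j])
        if line is None:
            continue
        a, b, c = line
        inliers = [n for n, (x, y) in enumerate(points)
                   if abs(a * x + b * y + c) < thresh]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = line, inliers
    return best_line, best_inliers

# Usage: ten points on y = 2x + 1 with high weights, four outliers with low ones.
pts = [(float(x), 2.0 * x + 1.0) for x in range(10)]
pts += [(0.0, 10.0), (5.0, -3.0), (9.0, 0.0), (2.0, 20.0)]
wts = [1.0] * 10 + [0.01] * 4
model, support = guided_ransac_line(pts, wts)
```

Setting all weights equal reduces this to ordinary uniform RANSAC sampling, which makes it easy to compare how many iterations each variant needs on the same data.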

“…We train our network on the subsets of the two datasets and evaluate the correspondence estimation performance on the other subsets of the same scenes as the test set and the datasets from the other scenes as the test set. For thorough evaluation, we compare against both the traditional baseline technique and the state-of-the-art approaches including [27], [54] and [10]. We adopt the standard angular difference between the estimation and the ground truth and measure the mean average precision (mAP) under an accuracy threshold (5°) for both rotation and translation.…”

Section: Evaluation Of Correspondence (mentioning)

confidence: 99%
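The angular-error metric mentioned in this statement can be written down compactly. The sketch below is one common reading, stated as an assumption rather than the citing paper's exact protocol: a pose counts as correct when both its rotation and translation angular errors fall below the threshold, and the score is the fraction of correct poses. It does not handle the sign ambiguity of translations recovered from an essential matrix, and the function names are hypothetical.

```python
import math

def rotation_angle_deg(R_est, R_gt):
    """Angle of the relative rotation R_est^T R_gt, in degrees."""
    # trace(R_est^T R_gt) equals the sum of elementwise products.
    tr = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    cos = max(-1.0, min(1.0, (tr - 1.0) / 2.0))  # clamp against rounding
    return math.degrees(math.acos(cos))

def translation_angle_deg(t_est, t_gt):
    """Angle between two translation directions, in degrees."""
    dot = sum(a * b for a, b in zip(t_est, t_gt))
    n = math.sqrt(sum(a * a for a in t_est)) * math.sqrt(sum(b * b for b in t_gt))
    cos = max(-1.0, min(1.0, dot / n))
    return math.degrees(math.acos(cos))

def accuracy_under_threshold(errors, thresh_deg=5.0):
    """Fraction of (rotation_error, translation_error) pairs, in degrees,
    whose larger error is below thresh_deg."""
    return sum(1 for r, t in errors if max(r, t) < thresh_deg) / len(errors)

# Usage: compare an estimate rotated 10 degrees about z against the identity.
a = math.radians(10.0)
R_z = [[math.cos(a), -math.sin(a), 0.0],
       [math.sin(a), math.cos(a), 0.0],
       [0.0, 0.0, 1.0]]
R_id = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rot_err = rotation_angle_deg(R_z, R_id)
```

Averaging `accuracy_under_threshold` over several thresholds is one way such a number is turned into a mean average precision; whether the citing paper does so is not stated in the excerpt.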

Neural3D: Light-weight Neural Portrait Scanning via Context-aware Correspondence Learning

Suo, Zhang, et al. 2020

Proceedings of the 28th ACM International Conference on Multimedia
