In this work we address the problem of finding reliable pixel-level correspondences under difficult imaging conditions. We propose an approach where a single convolutional neural network plays a dual role: It is simultaneously a dense feature descriptor and a feature detector. By postponing the detection to a later stage, the obtained keypoints are more stable than their traditional counterparts based on early detection of low-level structures. We show that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations. The proposed method obtains state-of-the-art performance on both the difficult Aachen Day-Night localization dataset and the InLoc indoor localization benchmark, as well as competitive performance on other benchmarks for image matching and 3D reconstruction. Figure 1: Examples of matches obtained by the D2-Net method. The proposed method can find image correspondences even under significant appearance differences caused by strong changes in illumination such as day-to-night, changes in depiction style or under image degradation caused by motion blur.ciently via (approximate) nearest neighbor search [37] and the Euclidean distance. Sparse features offer a memory efficient representation and thus enable approaches such as Structure-from-Motion (SfM) [21,53] or visual localization [26,47,58] to scale. The keypoint detector typically considers low-level image information such as corners [19] or blob-like structures [30,32]. As such, local features can
Visual localization enables autonomous vehicles to navigate in their surroundings and augmented reality applications to link virtual to real worlds. Practical visual localization approaches need to be robust to a wide variety of viewing condition, including day-night changes, as well as weather and seasonal variations, while providing highly accurate 6 degree-of-freedom (6DOF) camera pose estimates. In this paper, we introduce the first benchmark datasets specifically designed for analyzing the impact of such factors on visual localization. Using carefully created ground truth poses for query images taken under a wide variety of conditions, we evaluate the impact of various factors on 6DOF camera pose estimation accuracy through extensive experiments with state-of-the-art localization approaches. Based on our results, we draw conclusions about the difficulty of different conditions, showing that long-term localization is far from solved, and propose promising avenues for future work, including sequence-based localization approaches and the need for better local features. Our benchmark is available at visuallocalization.net.
Accurately determining the position and orientation from which an image was taken, i.e., computing the camera pose, is a fundamental step in many Computer Vision applications. The pose can be recovered from 2D-3D matches between 2D image positions and points in a 3D model of the scene. Recent advances in Structure-from-Motion allow us to reconstruct large scenes and thus create the need for image-based localization methods that efficiently handle large-scale 3D models while still being effective, i.e., while localizing as many images as possible. This paper presents an approach for large scale image-based localization that is both efficient and effective. At the core of our approach is a novel prioritized matching step that enables us to first consider features more likely to yield 2D-to-3D matches and to terminate the correspondence search as soon as enough matches have been found. Matches initially lost due to quantization are efficiently recovered by integrating 3D-to-2D search. We show how visibility information from the reconstruction process can be used to improve the efficiency of our approach. We evaluate the performance of our method through extensive experiments and demonstrate that it offers the best combination of efficiency and effectiveness among current state-of-the-art approaches for localization.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.