2020
DOI: 10.48550/arxiv.2007.13867
Preprint

Robust Image Retrieval-based Visual Localization using Kapture

Abstract: In this paper, we present a versatile method for visual localization. It is based on robust image retrieval for coarse camera pose estimation and robust local features for accurate pose refinement. Our method is top ranked on various public datasets showing its ability of generalization and its great variety of applications. To facilitate experiments, we introduce kapture, a flexible data format and processing pipeline for structure from motion and visual localization that is released open source. We furthermo…

Cited by 15 publications (34 citation statements)
References: 50 publications
“…NetVLAD [1] is chosen as the global descriptor. We also find that fusing multiple global features together in a similar way as [10] is helpful.…”
Section: By Global Descriptors (mentioning)
confidence: 80%
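One simple way to realize the descriptor fusion mentioned in this excerpt is to L2-normalize each global descriptor and concatenate the results. The sketch below is only an illustration of that common scheme under this assumption, not necessarily the exact method of the cited work; the descriptor variables are hypothetical placeholders.

import numpy as np

def l2_normalize(v, eps=1e-12):
    # Scale a descriptor to unit L2 norm (guard against zero vectors).
    return v / max(np.linalg.norm(v), eps)

def fuse_global_descriptors(descriptors):
    # Fuse several global image descriptors (e.g. NetVLAD, AP-GeM) by
    # L2-normalizing each one, concatenating them, and renormalizing,
    # so that dot-product similarities remain comparable across images.
    parts = [l2_normalize(np.asarray(d, dtype=np.float32)) for d in descriptors]
    return l2_normalize(np.concatenate(parts))

# Hypothetical usage: netvlad_vec and apgem_vec would come from the
# respective feature extractors; the names are placeholders.
# fused = fuse_global_descriptors([netvlad_vec, apgem_vec])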
“…We compare our retrieval method to AP-GeM (Revaud et al., 2019a) and HOW (Tolias et al., 2020) and report results in Table 7. AP-GeM is the default method used in Kapture (Humenberger et al., 2020). We observe that using FIRe leads to better visual localization, especially in the most challenging scenario of night image localization and the strictest localization threshold: performance improves by 2% compared to AP-GeM and by 1.5% compared to HOW on night images at a threshold of 0.25m and 2°.…”
(mentioning)
confidence: 88%
“…In this section, we evaluate FIRe for the task of visual localization, where retrieval is used as a first-stage filter before more precise, local feature-based geometric matching. To this end, we follow the pipeline proposed by Kapture (Humenberger et al., 2020) on the Aachen Day-Night v1.1 dataset (Sattler et al., 2018). In this scenario, a global Structure-from-Motion map is built… [Table excerpt: percentage of successfully localized images on the Aachen Day-Night v1.1 dataset when changing the retrieval method in the Kapture pipeline of Humenberger et al. (2020), at thresholds of 0.25m/2°, 0.5m/5° and 5m/10° for day and night images; AP-GeM (Revaud et al., 2019a): 88.8 / 96.6 / 99.6 (day) and 72.3 / 86.9 / 97.9 (night); the HOW (Tolias et al., 2020) row is truncated in this excerpt.]…”
Section: Application To Visual Localization (mentioning)
confidence: 99%
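As a rough illustration of the two-stage pipeline described in this excerpt (global retrieval shortlists database images, then local features are matched against the SfM map and a pose is estimated), the sketch below shows only the retrieval stage concretely. It is a schematic assumption, not Kapture's actual API; the descriptor dimension and the toy data are invented for the example.

import numpy as np

def retrieve_shortlist(query_desc, db_descs, top_k=20):
    # First stage of a retrieval-based localization pipeline: rank database
    # images by global-descriptor similarity and keep the top-k candidates.
    # Descriptors are assumed L2-normalized, so the dot product equals cosine similarity.
    sims = db_descs @ query_desc
    return np.argsort(-sims)[:top_k]

# Toy usage with random unit vectors standing in for real global descriptors
# (AP-GeM, HOW, FIRe, ...). The shortlisted indices would then be handed to the
# second stage (local feature matching and P3P+RANSAC pose estimation), not shown here.
rng = np.random.default_rng(0)
db = rng.normal(size=(1000, 2048)).astype(np.float32)
db /= np.linalg.norm(db, axis=1, keepdims=True)
query = db[42] + 0.1 * rng.normal(size=2048).astype(np.float32)
query /= np.linalg.norm(query)
print(retrieve_shortlist(query, db, top_k=5))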
“…RGB-D variants of scene coordinate regression methods dominate rankings for indoor re-localisation, which has been attributed to the inherent difficulty of the indoor scenario regarding texture-less surfaces and ambiguous structures that make it difficult to find and match sparse features [5,38,71,83]. For outdoor re-localisation, classical approaches, which match hand-crafted [61,71,81] or learned descriptors [21,32,57,58] at sparse feature locations to a 3D SfM reconstruction, achieve vastly superior results compared to scene coordinate regression. This has been attributed to an inability of scene coordinate regression to scale to spatially large scenes [62,77].…”
Section: Related Work (mentioning)
confidence: 99%
“…hLoc [57] combines image retrieval with SuperPoint [21] features and SuperGlue [58] for matching, followed by P3P+RANSAC-based pose estimation. DenseVLAD+R2D2 [32,55,78] uses DenseVLAD [78] for retrieving image pairs and R2D2 features for matching. The training images and poses are used to construct a 3D SfM map, and test images are localised using 2D-3D matches and P3P+RANSAC.…”
Section: Re-localisation Evaluation (mentioning)
confidence: 99%
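The P3P+RANSAC step mentioned in this excerpt can be illustrated with OpenCV's solvePnPRansac. The sketch below fabricates synthetic 2D-3D correspondences purely to stay self-contained; in a real pipeline they would come from matching local features (e.g. SuperPoint+SuperGlue or R2D2) against the SfM map, and the intrinsic matrix K is an assumed example.

import cv2
import numpy as np

# Synthetic 2D-3D correspondences: 3D map points are projected with a known
# ground-truth pose and perturbed with pixel noise, only to keep the example
# self-contained.
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)
pts_3d = rng.uniform(-1.0, 1.0, size=(50, 3)) + np.array([0.0, 0.0, 5.0])
rvec_gt = np.array([[0.05], [-0.10], [0.02]])
tvec_gt = np.array([[0.20], [-0.10], [0.30]])
pts_2d, _ = cv2.projectPoints(pts_3d, rvec_gt, tvec_gt, K, dist)
pts_2d = pts_2d.reshape(-1, 2) + rng.normal(scale=0.5, size=(50, 2))

# Absolute camera pose from 2D-3D matches, using a P3P minimal solver inside RANSAC.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts_3d.astype(np.float32), pts_2d.astype(np.float32), K, dist,
    iterationsCount=1000, reprojectionError=4.0, flags=cv2.SOLVEPNP_P3P)
print(ok, rvec.ravel(), tvec.ravel(), 0 if inliers is None else len(inliers))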