2016
DOI: 10.1007/978-3-319-46448-0_30
|View full text |Cite
|
Sign up to set email alerts
|

Localizing and Orienting Street Views Using Overhead Imagery

Abstract: In this paper we aim to determine the location and orientation of a ground-level query image by matching to a reference database of overhead (e.g. satellite) images. For this task we collect a new dataset with one million pairs of street view and overhead images sampled from eleven U.S. cities. We explore several deep CNN architectures for crossdomain matching -Classification, Hybrid, Siamese, and Triplet networks. Classification and Hybrid architectures are accurate but slow since they allow only partial feat… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
199
0

Year Published

2017
2017
2020
2020

Publication Types

Select...
5
3
1
1

Relationship

1
9

Authors

Journals

citations
Cited by 227 publications
(199 citation statements)
references
References 33 publications
0
199
0
Order By: Relevance
“…This is very important for photo-sharing platforms, for which only a fraction of the uploaded photos comes with geo-location. Authors of [151,152] worked towards this aim, by training a cross-view…”
Section: Multimodal Data Fusionmentioning
confidence: 99%
“…This is very important for photo-sharing platforms, for which only a fraction of the uploaded photos comes with geo-location. Authors of [151,152] worked towards this aim, by training a cross-view…”
Section: Multimodal Data Fusionmentioning
confidence: 99%
“…While DML loss often involves looking at pair or triplet as the example, recent works construct a batch of images as input instead, and only form pairs/triplets/clusters at the loss layer. Hence the mini-batch can be constructed randomly in any way as long as a lot of similar/dissimilar pairs/triplet can be formed (for example [33,27]).…”
Section: Training Mini-batch Constructionmentioning
confidence: 99%
“…They take advantage of deep neural networks [17,27,31,11] to construct a mapping from the data space to the embedding space so that the Euclidean distance in the embedding space can reflect the actual semantic distance between data points, i.e., a relatively large distance between inter-class samples and a relatively small distance between intra-class samples. Recently a variety of deep metric learning methods have been proposed and have demonstrated strong effectiveness in various tasks, such as image retrieval [30,23,19,5], person re-identification [26,37,48,2], and geo-localization [35,14,34]. Figure 1.…”
Section: Introductionmentioning
confidence: 99%