2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00848

Ground-to-Aerial Image Geo-Localization With a Hard Exemplar Reweighting Triplet Loss

Cited by 113 publications (87 citation statements)
References 26 publications
“…Effect of loss objectives. The triplet loss and contrastive loss are widely applied in previous works [5,6,16,35,43], and the weighted soft margin triplet loss is deployed in [4,12,18]. We evaluate these three losses on two tasks, i.e., Drone → Satellite and Satellite → Drone and compare three losses with the instance loss used in our baseline.…”
Section: Ablation Study and Further Discussion
Citation type: mentioning
confidence: 99%
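
For readers unfamiliar with the losses named in this statement, the following is a minimal sketch of a plain margin-based triplet loss next to the weighted soft-margin triplet loss in its commonly cited form, log(1 + exp(alpha * (d_pos - d_neg))). The function names, the value alpha = 10.0, and the margin of 0.5 are assumptions chosen for illustration. They are not taken from the cited paper or from the citing work, and this is not the hard exemplar reweighting loss itself.

```python
import math


def plain_triplet_loss(d_pos, d_neg, margin=0.5):
    """Margin-based triplet loss: zero once the negative is farther by `margin`."""
    return max(0.0, d_pos - d_neg + margin)


def soft_margin_triplet_loss(d_pos, d_neg, alpha=10.0):
    """Weighted soft-margin triplet loss: log(1 + exp(alpha * (d_pos - d_neg))).

    d_pos: distance between the query and its matching image.
    d_neg: distance between the query and a non-matching image.
    alpha: scaling weight (illustrative value, an assumption here).
    """
    x = alpha * (d_pos - d_neg)
    # Numerically stable softplus: avoids overflow when x is large.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    easy = (0.1, 2.0)   # matching pair is close, non-matching pair is far
    hard = (1.2, 1.0)   # non-matching pair is closer than the matching pair
    for name, (dp, dn) in (("easy", easy), ("hard", hard)):
        print(name, plain_triplet_loss(dp, dn), soft_margin_triplet_loss(dp, dn))
```

Unlike the margin-based form, the soft-margin form never reaches exactly zero, so easy triplets still contribute a small gradient while hard triplets dominate the loss.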
“…To steer where to focus in images, the attention mechanism is introduced to the field of cross-view geolocalization. Cai et al. [2] introduce a lightweight attention module that combines spatial and channel attention mechanisms to emphasize visually salient features. They also propose a novel reweighting loss that adaptively allocates weights to triplets according to their difficulties, thus improving the quality of network training.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Towards the above goal, several recent works incorporate convolutional neural networks (CNNs) with NetVLAD layers [8], capsule networks [20] or attention mechanisms [2,16] to learn visually discriminative representations. However, the locality assumption of their CNN architectures hinders their performance in complex scenarios, where visual interferences such as obstacles and transient objects (e.g., cars and pedestrians) may exist.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…[19] showed that orientation information, in the form of hand-crafted UV maps, helps to convey the approximate viewpoint difference to the network during training. Recently, [5] applied both spatial and channel-wise attention to the feature maps and trained them with a hard exemplar reweighting triplet loss.…”
Section: Domain-invariant Features a Central Question In Cross-
Citation type: mentioning
confidence: 99%
“…Satellite imagery, on the other hand, is broadly available for most parts of the world with services like Google Maps. This encouraged researchers to focus on cross-view image-based geo-localization [40,18,36,19,31,5,33,32] as a more general and inclusive alternative. The overall idea is to predict the latitude and longitude of a street-level image by matching it against a GPS-tagged satellite database.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
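
The matching step described in the last statement can be sketched as a nearest-neighbor lookup over image embeddings. This is only an illustration under stated assumptions: the embeddings are taken as already computed by some network, the three-dimensional vectors and coordinates in `satellite_db` are made-up toy values, and `geo_localize` is a hypothetical helper name rather than anything defined in the cited works.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def geo_localize(query_embedding, database):
    """Return the (lat, lon) tag of the database entry closest to the query.

    database: list of (embedding, (lat, lon)) pairs, one per satellite image.
    """
    q = l2_normalize(query_embedding)
    best = min(
        database,
        key=lambda item: squared_distance(q, l2_normalize(item[0])),
    )
    return best[1]


# Toy GPS-tagged satellite database (values invented for illustration).
satellite_db = [
    ([1.0, 0.0, 0.0], (40.7128, -74.0060)),
    ([0.0, 1.0, 0.0], (34.0522, -118.2437)),
    ([0.0, 0.0, 1.0], (41.8781, -87.6298)),
]

if __name__ == "__main__":
    street_view_embedding = [0.1, 0.9, 0.2]
    print(geo_localize(street_view_embedding, satellite_db))  # (34.0522, -118.2437)
```

A linear scan is enough for a toy database. At realistic scale an approximate nearest-neighbor index would normally replace the `min` over all entries.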