2022
DOI: 10.48550/arxiv.2201.02001
Preprint

TransVPR: Transformer-based place recognition with multi-level attention aggregation

Abstract: Visual place recognition is a challenging task for applications such as autonomous driving navigation and mobile robot localization. Distracting elements present in complex scenes often lead to deviations in the perception of visual place. To address this problem, it is crucial to integrate information from only task-relevant regions into image representations. In this paper, we introduce a novel holistic place recognition model, TransVPR, based on vision Transformers. It benefits from the desirable propert…

Citations: cited by 1 publication (2 citation statements)
References: 49 publications
“…Especially since Vision Transformers have been shown to transfer well to other vision tasks [19]. Recent transformer-based works on object reidentification [20], and VPR [21] support this trend and have achieved state-of-the-art performance.…”
Section: A Visual Place Recognition
Citation type: mentioning
confidence: 99%
“…Our initial model consist of a pretrained Vision Transformer (ViT-B/16 with resolution of 224 [18]) for which we removed the classification head. Our choice is motivated by the recent success of Vision Transformers for place recognition [21]. We used the feature vector from the penultimate layer (dimension 1x768) as the global place descriptor.…”
Section: A Training a New VPR System
Citation type: mentioning
confidence: 99%
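
The citing statement above describes using a 1x768 feature vector as a global place descriptor. As a minimal sketch of how such a descriptor might be used for place recognition, the example below matches a query descriptor against a small database by cosine similarity. The matching rule is an assumption (the quoted text does not say how descriptors are compared), and the function names, the random stand-in vectors, and the database size are all hypothetical; no real ViT features are computed here.

```python
import math
import random

DESCRIPTOR_DIM = 768  # matches the 1x768 descriptor size quoted above


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, database):
    """Return (index, similarity) of the best-matching database descriptor.

    Assumption: nearest neighbour under cosine similarity; the quoted
    citation statement does not specify the matching rule.
    """
    q = l2_normalize(query)
    scores = [cosine_similarity(q, l2_normalize(d)) for d in database]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical data: random vectors stand in for real ViT features.
rng = random.Random(0)
database = [
    [rng.gauss(0.0, 1.0) for _ in range(DESCRIPTOR_DIM)] for _ in range(5)
]
# A query built as a slightly perturbed copy of database entry 3,
# imitating a revisit of the same place under small appearance change.
query = [x + rng.gauss(0.0, 0.1) for x in database[3]]

best_index, best_score = retrieve(query, database)
print(best_index, round(best_score, 3))
```

With real features, `database` would hold one descriptor per reference image and `query` the descriptor of the image to be localized; a linear scan like this is only practical for small databases.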