2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021
DOI: 10.1109/cvpr46437.2021.00579

Siamese Natural Language Tracker: Tracking by Natural Language Descriptions with Siamese Trackers

Cited by 44 publications (37 citation statements)
References 34 publications
“…To mitigate the gap, we propose an asymmetrical searching strategy (ASS) to adapt the unified VL representation for improvements. Different from current VL tracking [19] adopting symmetrical and fixed template and search branches as in vision-only Siamese tracking [30], we argue that the learning framework of mixed modality should be adaptive and not fixed. To this end, ASS borrows the idea from neural architecture search (NAS) [62,40] to separately learn distinctive and asymmetrical networks for mixed modality in different branches and ModaMixers.…”
mentioning
confidence: 74%
“…Vision-Language Tracking. Natural language contains high-level semantics and has been leveraged to foster vision-related tasks [20,29,2] including tracking [32,18,19]. The work [32] first introduces linguistic description to tracking and shows that language enhances the robustness of vision-based method.…”
Section: Related Work
mentioning
confidence: 99%