2023
DOI: 10.1109/tmm.2022.3206668

Learning Localization-Aware Target Confidence for Siamese Visual Tracking

Abstract: Siamese trackers based on 3D region proposal network (RPN) have shown remarkable success with deep Hough voting. However, using a single seed point feature as the cue for voting fails to produce high-quality 3D proposals. Additionally, the equal treatment of seed points in the voting process, regardless of their significance, exacerbates this limitation. To address these challenges, we propose a novel transformer-based voting scheme to generate better proposals. Specifically, a global-local transformer (GLT) m…

Cited by 22 publications
(3 citation statements)
References 90 publications
“…Results on KITTI. We compare our algorithm with ten state-of-the-art algorithms [2]-[7], [11], [12], [43], [44] on the KITTI dataset [17]. Besides M2-Track [7] and CXTrack [43] which adopt a motion-centric (MC) paradigm and a full frame-based similarity (FFS) paradigm, respectively, our algorithm and all other compared algorithms use a region-based similarity paradigm.…”
Section: B Comparison With State-of-the-arts (mentioning)
confidence: 99%
“…Transformer is originally proposed in the area of natural language processing (Vaswani et al 2017), showing an excellent ability in modeling long-range dependency. It becomes popular in computer vision recently and has been widely used for image classification (Liu et al 2021; Xu et al 2021; Zhang et al 2022a,b), object detection (Carion et al 2020; Wang et al 2021a, 2022), pose estimation (Xu et al 2022), and object tracking (Lan et al 2022; Nie et al 2022a). The success motivates researchers to extend it to 3D point cloud tasks.…”
Section: Transformer (mentioning)
confidence: 99%
“…Among the array of deep learning approaches stand out as exemplary, demonstrating exceptional performance in fault diagnosis. Some typical deep learning models, for example, convolutional neural network (CNN) [6][7][8], deep belief network [9] and deep auto-encoder [10], already employed in fault diagnosis effectively. However, the effectiveness of deep learning applications is contingent upon a fundamental assumption, the learning and testing data adhere to a same distribution, originating from the same operating conditions [11].…”
Section: Introduction (mentioning)
confidence: 99%