2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00628

Learning Discriminative Model Prediction for Tracking

Abstract: The current strive towards end-to-end trainable computer vision systems imposes major challenges for the task of visual tracking. In contrast to most other vision problems, tracking requires the learning of a robust target-specific appearance model online, during the inference stage. To be end-to-end trainable, the online learning of the target model thus needs to be embedded in the tracking architecture itself. Due to these difficulties, the popular Siamese paradigm simply predicts a target feature template. …

citations
Cited by 1,154 publications
(1,143 citation statements)
references
References 39 publications
“…With a total of 123 videos, the size of the dataset is approximately 13.5 G. Figure 12 compares our tracker with four state-of-the-art trackers in terms of success rate and speed. The success rate of our tracker is slightly lower than that of DiMP50 [17], yet its speed is higher. Moreover, although DiMP18 is faster than our tracker, its success rate is lower.…”
Section: Experiments and Discussion (citation type: mentioning)
confidence: 86%
“…The networks of such approaches require constant fine-tuning, preventing real-time tracking requirements to be met. From the perspective of methods, in addition to the Siamese network-based methods that have dominated in recent years (e.g., [12]), a research branch began to focus on small sample learning target tracking methods represented by Meta Learning (e.g., [13, 14]), with another research branch always insisting on the use of correlation filter approaches (e.g., [15, 16, 17, 18]). Ocean [12] represents the trackers based on a Siamese network evolved from Anchor-Based to Anchor-Free.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…They implement fine-tuning the backbone for the end-to-end training. After an analysis on the impact of different feature blocks in DiMP [5], they use the features from block3 and block4 for IoU-Net, and only from block4 for the classifier. The feature extractor F is shared and only performed on a single image patch per frame.…”
Section: Baseline RGB Tracker (citation type: mentioning)
confidence: 99%
“…Finally, we can see how fine-tuning only on RGB improves the performance of the pre-trained model, but to a lesser extent than using TIR. In the lower part of Table 1 we analyze the effectiveness of each fusion mechanism for DiMP [5], which we discuss in detail in the remainder of this section. Pixel-level fusion.…”
Section: Implementation Details (citation type: mentioning)
confidence: 99%