2016
DOI: 10.1007/978-3-319-46448-0_45

Learning to Track at 100 FPS with Deep Regression Networks

Abstract: Machine learning techniques are often used in computer vision due to their ability to leverage large amounts of training data to improve performance. Unfortunately, most generic object trackers are still trained from scratch online and do not benefit from the large number of videos that are readily available for offline training. We propose a method for offline training of neural networks that can track novel objects at test-time at 100 fps. Our tracker is significantly faster than previous methods t…

Cited by 995 publications (806 citation statements)
References: 40 publications
“…The common approach is to specify a target by means of a bounding box around the object and to track this target as it moves throughout the video [38,33,20]. The paradigm has proven to be effective and considerable progress has been achieved [17,37,34,3,11]. Yet, the fundamental assumption of having a bounding box target specification available has never been challenged.…”
Section: Introduction (mentioning)
confidence: 99%
“…In order to describe the target better, we applied convolutional networks to learn robust representations for visual tracking without offline training using a large amount of auxiliary data, which is inspired by recent studies [11,18]. First, we use predefined convolutional filters to extract the high-order features.…”
Section: Convolutional Network Model (mentioning)
confidence: 99%
“…In (Hong et al, 2015a), sampled feature maps are classified by SVM to generate a saliency map. More recently, a number of studies use Recurrent Neural Networks (RNNs) for visual tracking (Bertinetto et al, 2016, Held et al, 2016, Chen and Tao, 2016). In (Held et al, 2016), Held et.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the first image, the target is cropped as target template with some background texture. Using the sample generation idea from (Held et al, 2016), we randomly shift and scale the target in the second image as search image to simulate the motion of the target and camera simultaneously. In tracking, we generally cropped a search image in the new frame based on the target's previous location instead to track the target in the whole image.…”
Section: Heat Map For Target Localization Prediction (mentioning)
confidence: 99%
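
The shift-and-scale sample generation described in the statement above can be sketched roughly as follows. This is a minimal illustration rather than the cited authors' code: the `Box` type, the `jitter_box` and `laplace` helpers, the choice of a Laplace distribution, and every parameter value (`shift_scale`, `size_scale`, `min_ratio`, `max_ratio`) are assumptions made for the example and are not stated in the text quoted here.

```python
import random
from dataclasses import dataclass


@dataclass
class Box:
    cx: float  # center x, in pixels
    cy: float  # center y, in pixels
    w: float   # width, in pixels
    h: float   # height, in pixels


def laplace(rng: random.Random, scale: float) -> float:
    # The difference of two i.i.d. exponential samples with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def jitter_box(box: Box, rng: random.Random,
               shift_scale: float = 0.2,       # assumed value
               size_scale: float = 1.0 / 15,   # assumed value
               min_ratio: float = 0.6,         # assumed value
               max_ratio: float = 1.4) -> Box:  # assumed value
    """Return a randomly shifted and rescaled copy of `box`.

    The shift is proportional to the box size, so the simulated motion
    is expressed relative to the target rather than in absolute pixels.
    """
    def size_ratio() -> float:
        # Resample until the size change stays within the allowed range.
        while True:
            r = 1.0 + laplace(rng, size_scale)
            if min_ratio <= r <= max_ratio:
                return r

    return Box(
        cx=box.cx + box.w * laplace(rng, shift_scale),
        cy=box.cy + box.h * laplace(rng, shift_scale),
        w=box.w * size_ratio(),
        h=box.h * size_ratio(),
    )
```

Under these assumptions, a training pair would be built by cropping the template around the original box in the first image and cropping the search region around `jitter_box(box, rng)` in the second, so that the network sees the target displaced and resized as it would be under real object and camera motion.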