Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking

Danelljan, Martin; Robinson, Andreas; Khan, Fahad Shahbaz; Felsberg, Michael

doi:10.1007/978-3-319-46454-1_29

Cited by 1,491 publications

(1,328 citation statements)

References 40 publications

Citation statement types: Supporting, Mentioning (1,326), Contrasting, Unclassified

“…The Discriminative Scale Space Tracker (DSST) [2] tracker is essentially an extension of KCF that can handle scale changes and outperformed the KCF by a small margin in the VOT2014 challenge. As further axis-aligned trackers, we include ANT [17], L1APG [1], and the best performing tracker from the VOT2016 challenge, the continuous convolution filters (CCOT) from Danelljan et al. [3]. We include the LGT [15] as one of the few open source trackers that estimates the object position as… [Figure 6: bmx-trees from DAVIS [12]; legend: box-axis-aligned, box-rot, DSST [2], CCOT [3], ANT [17], L1APG [1]]…”

Section: Methods (mentioning)

confidence: 99%

“…As further axis-aligned trackers, we include ANT [17], L1APG [1], and the best performing tracker from the VOT2016 challenge, the continuous convolution filters (CCOT) from Danelljan et al. [3]. We include the LGT [15] as one of the few open source trackers that estimates the object position as… [Figure 6: bmx-trees from DAVIS [12]; legend: box-axis-aligned, box-rot, DSST [2], CCOT [3], ANT [17], L1APG [1]] On the left, differences between box-no-scale and box-axis-aligned indicate that the object is changing scale and is occluded at frame 18 and around frames 60-70.…”

Section: Methods (mentioning)

confidence: 99%

Böttger, Follmann, Fauser. Pattern Recognition. Lecture Notes in Computer Science, 2017.

The accuracy of object detectors and trackers is most commonly evaluated by the Intersection over Union (IoU) criterion. To date, most approaches are restricted to axis-aligned or oriented boxes and, as a consequence, many datasets are only labeled with boxes. Nevertheless, axis-aligned or oriented boxes cannot accurately capture an object's shape. To address this, a number of densely segmented datasets has started to emerge in both the object detection and the object tracking communities. However, evaluating the accuracy of object detectors and trackers that are restricted to boxes on densely segmented data is not straightforward. To close this gap, we introduce the relative Intersection over Union (rIoU) accuracy measure. The measure normalizes the IoU with the optimal box for the segmentation to generate an accuracy measure that ranges between 0 and 1 and allows a more precise measurement of accuracies. Furthermore, it enables an efficient and easy way to understand scenes and the strengths and weaknesses of an object detection or tracking approach. We display how the new measure can be efficiently calculated and present an easy-to-use evaluation framework. The framework is tested on the DAVIS and the VOT2016 segmentations and has been made available to the community.
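The normalization described in this abstract can be illustrated with a small sketch. It assumes that rIoU is the IoU of a predicted box with the segmentation, divided by the best IoU any box can reach on that segmentation; that reading of "optimal box", the function names, and the brute-force search over axis-aligned boxes are assumptions made here for illustration, not the authors' evaluation framework.

```python
def box_mask_iou(mask, box):
    """IoU between a binary mask (list of rows of 0/1) and an axis-aligned box.

    box = (r0, c0, r1, c1) with rows r0..r1-1 and columns c0..c1-1 inside.
    """
    r0, c0, r1, c1 = box
    mask_area = sum(sum(row) for row in mask)
    box_area = (r1 - r0) * (c1 - c0)
    inter = sum(mask[r][c] for r in range(r0, r1) for c in range(c0, c1))
    union = mask_area + box_area - inter
    return inter / union if union else 0.0


def best_box_iou(mask):
    """Highest IoU reachable by any axis-aligned box (exhaustive search).

    Assumption: this stands in for the 'optimal box'. It is only practical
    for tiny masks; a real tool would need a faster optimizer.
    """
    h, w = len(mask), len(mask[0])
    best = 0.0
    for r0 in range(h):
        for r1 in range(r0 + 1, h + 1):
            for c0 in range(w):
                for c1 in range(c0 + 1, w + 1):
                    best = max(best, box_mask_iou(mask, (r0, c0, r1, c1)))
    return best


def relative_iou(mask, box):
    """IoU of `box`, normalized by the best achievable box IoU (in [0, 1])."""
    opt = best_box_iou(mask)
    return box_mask_iou(mask, box) / opt if opt else 0.0


# L-shaped object: no box can cover it without also covering background.
l_shape = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
full_box = (0, 0, 3, 3)
print(box_mask_iou(l_shape, full_box))  # plain IoU of the full bounding box
print(relative_iou(l_shape, full_box))  # same box, scored relative to the best box
```

On a non-rectangular shape such as this one, even the best box has an IoU below 1, so plain IoU understates how good a box prediction is; dividing by the best achievable value rescales the score so that the best possible box scores 1.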

“…1(b). These trackers include MEEM [22], CN [23], KCF [24], HCSVT [25], DSST [26], CNN-SVM [27], and C-COT [28].…”

Section: B Quantitative Comparisons (mentioning)

confidence: 99%

Fan, Cong. Layered Multitask Tracker via Spatial–Temporal Laplacian Graph. IEEE Signal Process. Lett., 2017.

Abstract: Most multitask trackers define the trace of each candidate as one task, and assume all tasks are equally related. Multitask learning is only evaluated on the current frame. In fact, these assumptions are limited, and ignore the multitask relationship in consecutive frames. In this letter, we propose a discriminative layered multitask tracker via spatial-temporal Laplacian graphs, which defines the layered tasks from a novel view, and naturally incorporates the global and local target information into reverse multitask tracking process. The spatial-temporal Laplacian graphs not only exploit the sequential consistent information of the target, but also make full use of the geometric structure corresponding to the tasks among the adjacent frames. Besides, ℓ0 norm constraint and labeling information are used to improve the tracking robustness. Encouraging experimental results on challenging sequences justify that the proposed method performs well both in accuracy and robustness against some related trackers.

“…Similarly, in order to infer the accurate location of the target object, a tracker needs to take changes of several appearance (illumination change, blurriness, occlusion) and dynamic (expanding, shrinking, aspect ratio change) properties into account. Although visual tracking research has achieved remarkable advances in the past decades [21-23, 32, 38-40], and thanks to deep learning especially in the recent years [6,8,29,35,36,41], most methods employ only a subset of these properties, or are too slow to perform in real-time.…”

Section: Introduction (mentioning)

confidence: 99%

“…41]. Similarly, for correlation filter based trackers, only some of the convolutional features are useful at a time [6,8,26,30]. Therefore, by introducing an adaptive selection of attentional properties, additional dynamic properties can be considered for increased accuracy and robustness while keeping the computational time constant.…”

Section: Introduction (mentioning)

confidence: 99%

Choi, Chang, Yun, et al. Attentional Correlation Filter Network for Adaptive Visual Tracking. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

We propose a new tracking framework with an attentional mechanism that chooses a subset of the associated correlation filters for increased robustness and computational efficiency. The subset of filters is adaptively selected by a deep attentional network according to the dynamic properties of the tracking target. Our contributions are manifold, and are summarised as follows: (i) Introducing the Attentional Correlation Filter Network which allows adaptive tracking of dynamic targets. (ii) Utilising an attentional network which shifts the attention to the best candidate modules, as well as predicting the estimated accuracy of currently inactive modules. (iii) Enlarging the variety of correlation filters which cover target drift, blurriness, occlusion, scale changes, and flexible aspect ratio. (iv) Validating the robustness and efficiency of the attentional mechanism for visual tracking through a number of experiments. Our method achieves similar performance to non real-time trackers, and state-of-the-art performance amongst real-time trackers.
