2016 IEEE Winter Conference on Applications of Computer Vision (WACV)
DOI: 10.1109/wacv.2016.7477569
Online tracking using saliency

Abstract: When tracking small moving objects, primates use smooth pursuit eye movements to keep a target in the center of the field of view. In this paper, we propose the Smooth Pursuit tracking algorithm, which uses three kinds of saliency maps to perform online target tracking: appearance, location, and motion. In addition to tracking single targets, our method can track multiple targets with little additional overhead. The appearance saliency map uses deep convolutional neural network features along with gnostic fiel…

Cited by 7 publications (5 citation statements)
References 42 publications
“…Driven by the reintroduction and improvement of Convolutional Neural Networks (CNNs) (LeCun et al, 1989;He et al, 2016), the availability of large-scale datasets (Deng et al, 2009), and the affordability of high-performance compute resources such as graphics processing units (GPUs), deep learning has enjoyed unprecedented popularity in recent years. This success in computer vision domains such as image labeling (Krizhevsky et al, 2012), object detection (Girshick et al, 2014), semantic segmentation (Badrinarayanan et al, 2017;Long et al, 2015), and target tracking (Wang and Yeung, 2013;Yousefhussien et al, 2016), has generated an interest in applying these frameworks for 3D classification.…”
Section: Indirect Methods (mentioning)
confidence: 99%
“…In [5], color, texture, and saliency features are used for a mean-shift tracking algorithm. The previous works [3], [4], [5] use only low-level visual saliency information, which runs contrary to our intention of exploiting semantically higher-level information. Moreover, in [5] the features are combined in a straightforward manner (concatenation).…”
Section: Introduction (mentioning)
confidence: 90%
“…These features were selected from predefined hand-crafted features (such as SIFT). In [4], the method extracts motion, appearance, and location saliency maps and predicts the next location of objects via multiplication of all saliency maps with the input initial tracking bounding box(es). In [5], color, texture, and saliency features are used for a mean-shift tracking algorithm.…”
Section: Introduction (mentioning)
confidence: 99%
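The multiplicative combination of saliency maps described in the statement above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the map shapes, the [0, 1] value range, and reading out the next location as the peak of the combined map are all assumptions for the sake of the example.

```python
import numpy as np

def next_location(appearance, location, motion):
    """Combine three HxW saliency maps (assumed values in [0, 1]) by
    elementwise multiplication and return the peak as (x, y)."""
    combined = appearance * location * motion  # elementwise product
    y, x = np.unravel_index(np.argmax(combined), combined.shape)
    return x, y, combined

# Toy 5x5 maps: appearance and motion both peak at column 3, row 2.
app = np.full((5, 5), 0.1); app[2, 3] = 1.0   # hypothetical appearance map
loc = np.full((5, 5), 0.5)                    # uniform location prior
mot = np.full((5, 5), 0.5); mot[2, 3] = 0.8   # hypothetical motion map

x, y, comb = next_location(app, loc, mot)
# The product sharpens agreement between maps: the predicted location
# is the pixel where all three cues are simultaneously strong.
```

Multiplication (rather than, say, averaging) acts as a soft AND: a location must score well on every cue to survive, which suppresses spurious peaks present in only one map.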
“…Zhang et al [35] proposed an online discriminative feature selection method that couples the classifier score with the importance of samples, leading to a more robust and efficient tracker. Yousefhussien, Browning, and Kanan [32] proposed the Smooth Pursuit Tracking (SPT) algorithm, which uses three kinds of saliency maps (appearance, location, and motion) to track objects under all conditions, including long-term occlusion. Elliethy and Sharma [6] proposed an innovative approach that simultaneously registers captured WAMI frames with vector road map data (OpenStreetMap [15]) and tracks vehicles within those registered frames, leading to efficient results.…”
Section: RGB Aerial Imagery (mentioning)
confidence: 99%
“…Aerial imagery typically yields a relatively small number of pixels on a target (roughly 20-100 pixels) and lower sampling rates (1-2 Hz) than traditional video, degrading the performance of appearance-based tracking-by-detection methods. Different sensor modalities such as infrared [2,8,3], Wide Area Motion Imagery (WAMI) [22,4,5], and RGB [23,35,32,6] have all shown the potential to improve tracking; however, most fail to achieve persistent real-time tracking due to the unique challenges posed by aerial imagery or a dependency on external sources of information (e.g. road map information) for achieving optimum results.…”
Section: Introduction (mentioning)
confidence: 99%