Online tracking using saliency

Yousefhussien, Mohammed; Browning, N. Andrew; Kanan, Christopher

doi:10.1109/wacv.2016.7477569

Cited by 7 publications

(5 citation statements)

References 42 publications

“…Driven by the reintroduction and improvement of Convolutional Neural Networks (CNNs) (LeCun et al, 1989; He et al, 2016), the availability of large-scale datasets (Deng et al, 2009), and the affordability of high-performance compute resources such as graphics processing units (GPUs), deep learning has enjoyed unprecedented popularity in recent years. This success in computer vision domains such as image labeling (Krizhevsky et al, 2012), object detection (Girshick et al, 2014), semantic segmentation (Badrinarayanan et al, 2017; Long et al, 2015), and target tracking (Wang and Yeung, 2013; Yousefhussien et al, 2016), has generated an interest in applying these frameworks for 3D classification.…”

Section: Indirect Methods (mentioning)

confidence: 99%

A multi-scale fully convolutional network for semantic labeling of 3D point clouds

Yousefhussien, Kelbe, Ientilucci, et al. 2018

ISPRS Journal of Photogrammetry and Remote Sensing

108

When classifying point clouds, a large amount of time is devoted to the process of engineering a reliable set of features which are then passed to a classifier of choice. Generally, such features (usually derived from the 3D covariance matrix) are computed using the surrounding neighborhood of points. While these features capture local information, the process is usually time-consuming, and requires the application at multiple scales combined with contextual methods in order to adequately describe the diversity of objects within a scene. In this paper we present a 1D-fully convolutional network that consumes terrain-normalized points directly with the corresponding spectral data, if available, to generate point-wise labeling while implicitly learning contextual features in an end-to-end fashion. Our method uses only the 3D-coordinates and three corresponding spectral features for each point. Spectral features may either be extracted from 2D-georeferenced images, as shown here for Light Detection and Ranging (LiDAR) point clouds, or extracted directly for passive-derived point clouds, i.e. from multiple-view imagery. We train our network by splitting the data into square regions, and use a pooling layer that respects the permutation-invariance of the input points. Evaluated using the ISPRS 3D Semantic Labeling Contest, our method scored second place with an overall accuracy of 81.6%. We ranked third place with a mean F1-score of 63.32%, surpassing the F1-score of the method with highest accuracy by 1.69%. In addition to labeling 3D-point clouds, we also show that our method can be easily extended to 2D-semantic segmentation tasks, with promising initial results.

“…In [5], color, texture and saliency features are used for mean-shift tracking algorithm. The previous works [3], [4], [5] only use low-level visual saliency information which is opposite to our intention to exploit semantically higher level information. Moreover, in [5] the features are combined in a straightforward manner (concatenation).…”

Section: Introduction (mentioning)

confidence: 90%

“…These features were selected from predefined hand-crafted features (such as SIFT). In [4], the method extracts motion, appearance and location saliency maps and predicts the next location of objects via multiplication of all saliency maps with the input initial tracking bounding box(es). In [5], color, texture and saliency features are used for mean-shift tracking algorithm.…”

Section: Introduction (mentioning)

confidence: 99%

Saliency Enhanced Robust Visual Tracking

Aytekin, Cricri, Aksu 2018

2018 7th European Workshop on Visual Information Processing (EUVIP)

Discrete correlation filter (DCF) based trackers have shown considerable success in visual object tracking. These trackers often make use of low to mid level features such as histogram of gradients (HoG) and mid-layer activations from convolution neural networks (CNNs). We argue that including semantically higher level information to the tracked features may provide further robustness to challenging cases such as viewpoint changes. Deep salient object detection is one example of such high level features, as it makes use of semantic information to highlight the important regions in the given scene. In this work, we propose an improvement over DCF based trackers by combining saliency based and other features based filter responses. This combination is performed with an adaptive weight on the saliency based filter responses, which is automatically selected according to the temporal consistency of visual saliency. We show that our method consistently improves a baseline DCF based tracker especially in challenging cases and performs superior to the state-of-the-art. Our improved tracker operates at 9.3 fps, introducing a small computational burden over the baseline which operates at 11 fps.

“…Zhang et al [35] proposed an online discriminative feature selection method that couples the classifier score with the importance of samples, leading to a more robust and efficient tracker. Yousefhussien, Browning and Kanan [32] propose the Smooth Pursuit Tracking (SPT) algorithm which uses three kinds of saliency maps: appearance, location, and motion to track objects under all conditions, including long-term occlusion. Elliethy and Sharma [6] proposed an innovative approach to register captured WAMI frames with vector road map data (Open Street Map [15]) and track vehicles within those registered frames simultaneously, leading to efficient results.…”

Section: RGB Aerial Imagery (mentioning)

confidence: 99%

“…Aerial imagery typically yields a relatively small number of pixels on a target (roughly 20-100 pixels) and comparatively lower sampling rates (1-2 Hz) than common traditional rates degrading the performance of appearance-based tracking-by-detection methods. Different sensor modalities such as infrared [2,8,3], Wide Area Motion Imagery (WAMI) [22,4,5] and RGB [23,35,32,6] have all shown the potential to improve tracking, however most of them perform poorly to achieve persistent tracking in real-time due to the unique challenges posed by aerial imagery or dependency on external sources of information (e.g. road map information) for achieving optimum results.…”

Section: Introduction (mentioning)

confidence: 99%

Aerial Vehicle Tracking by Adaptive Fusion of Hyperspectral Likelihood Maps

Uzkent, Rangnekar, Hoffman 2017

2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Hyperspectral cameras provide unique spectral signatures that can be used to solve surveillance tasks. This paper proposes a novel real-time hyperspectral likelihood maps-aided tracking method (HLT) inspired by an adaptive hyperspectral sensor. We focus on the target detection part of a tracking system and remove the necessity to build any offline classifiers and tune large amount of hyperparameters, instead learning a generative target model in an online manner for hyperspectral channels ranging from visible to infrared wavelengths. The key idea is that our adaptive fusion method can combine likelihood maps from multiple bands of hyperspectral imagery into one single more distinctive representation increasing the margin between mean value of foreground and background pixels in the fused map. Experimental results show that the HLT not only outperforms all established fusion methods but is on par with the current state-of-the-art hyperspectral target tracking frameworks.
