Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking

Ning, Guanghan; Zhi, Zhang; Huang, Chen; He, Zhihai; Ren, Xiaobo; Wang, Haohong

doi:10.48550/arxiv.1607.05781

Cited by 10 publications

(7 citation statements)

References 14 publications

“…In vision-based convoying, the system may lose sight of the object momentarily due to occlusion or lighting changes, and thus lose track of its leading agent. In an attempt to address this problem, we use recurrent layers stacked on top of our ReducedYOLO architecture, similarly to [26]. In their work, Ning et al use the last layer of features output by the YOLO network for n frames (concatenated with the YOLO bounding box prediction which has the highest IOU with the ground truth) and feed them to single forward Long-Term Short-Term Memory Network (LSTM).…”

Section: B Recurrent Methods (mentioning)

confidence: 99%
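
The citation statement above describes picking the YOLO box with the highest IOU against the ground truth and concatenating it with the frame's feature vector before feeding an LSTM. A minimal sketch of that selection-and-concatenation step follows, in plain Python. The corner box format `(x_min, y_min, x_max, y_max)` and the helper names `iou`, `best_box`, and `build_lstm_input` are assumptions made for illustration; they are not taken from the cited papers, and the LSTM itself is omitted.

```python
from typing import List, Sequence, Tuple

# Assumed box format (hypothetical): (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Degenerate boxes have zero union; report no overlap in that case.
    return inter / union if union > 0 else 0.0


def best_box(candidates: Sequence[Box], ground_truth: Box) -> Box:
    """Return the candidate with the highest IOU against the ground truth."""
    return max(candidates, key=lambda c: iou(c, ground_truth))


def build_lstm_input(features: Sequence[float], box: Box) -> List[float]:
    """Concatenate one frame's feature vector with its selected box."""
    return list(features) + list(box)


if __name__ == "__main__":
    gt = (0.0, 0.0, 2.0, 2.0)
    detections = [(5.0, 5.0, 6.0, 6.0), (0.0, 0.0, 2.0, 1.0), (1.0, 1.0, 3.0, 3.0)]
    chosen = best_box(detections, gt)
    print(build_lstm_input([0.1, 0.2], chosen))
```

Repeating `build_lstm_input` for each of the n frames would give the sequence that a recurrent layer consumes; how the original work batches or normalizes those vectors is not stated in the excerpt.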

Underwater multi-robot convoying using visual tracking by detection

Shkurti, Chang, Henderson, et al. 2017

2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Abstract: We present a robust multi-robot convoying approach that relies on visual detection of the leading agent, thus enabling target following in unstructured 3-D environments. Our method is based on the idea of tracking-by-detection, which interleaves efficient model-based object detection with temporal filtering of image-based bounding box estimation. This approach has the important advantage of mitigating tracking drift (i.e. drifting away from the target object), which is a common symptom of model-free trackers and is detrimental to sustained convoying in practice. To illustrate our solution, we collected extensive footage of an underwater robot in ocean settings, and hand-annotated its location in each frame. Based on this dataset, we present an empirical comparison of multiple tracker variants, including the use of several convolutional neural networks, both with and without recurrent connections, as well as frequency-based model-free trackers. We also demonstrate the practicality of this tracking-by-detection strategy in real-world scenarios by successfully controlling a legged underwater robot in five degrees of freedom to follow another robot's independent motion.

“…2: Overview of our Recurrent ReducedYOLO (RROLO) architecture. The original ROLO work [26] did not use bidirectional, dense layers, or multiple LSTM cells in their experiments.…”

Section: B Recurrent Methods (mentioning)

confidence: 99%

Underwater multi-robot convoying using visual tracking by detection

Shkurti, Chang, Henderson, et al. 2017

2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

“…Another related work to ours is [21]. They proposed a spatially supervised recurrent convolutional neural network in which a YOLO network [22] is applied on each frame to produce object detections and a recurrent neural network is used to directly regress YOLO detections.…”

Section: Visual Object Tracking (mentioning)

confidence: 99%

Deep Reinforcement Learning for Visual Object Tracking in Videos

Zhang, Maei, Wang, et al. 2017

Preprint

In this paper we introduce a fully end-to-end approach for visual tracking in videos that learns to predict the bounding box locations of a target object at every frame. An important insight is that the tracking problem can be considered as a sequential decision-making process and historical semantics encode highly relevant information for future decisions. Based on this intuition, we formulate our model as a recurrent convolutional neural network agent that interacts with a video over time, and our model can be trained with reinforcement learning (RL) algorithms to learn good tracking policies that pay attention to continuous, inter-frame correlation and maximize tracking performance in the long run. The proposed tracking algorithm achieves state-of-the-art performance in an existing tracking benchmark and operates at frame-rates faster than realtime. To the best of our knowledge, our tracker is the first neural-network tracker that combines convolutional and recurrent networks with RL algorithms.

“…More recently, as deep learning models have gained in popularity, solutions such as the gradient normal clipping strategy (Pascanu et al, 2013) have eased the overall implementation burden of RNNs. As RNNs have become more manageable from an estimation perspective, they have increasingly been used to model complicated sequential forecasting problems such as visual object tracking (Ning et al, 2016), speech recognition (Yildiz et al, 2013), and text generation (Graves, 2013), just to name a few. Simultaneously, RNNs have also seen a rise in usage in the dynamic systems literature due to their ability to replicate complex attractor dynamics that are often present in chaotic systems (Jaeger, 2001).…”

Section: Introduction (mentioning)

confidence: 99%
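
The statement above credits gradient norm clipping with easing RNN training. As a rough illustration of the general idea only: when the gradient's Euclidean norm exceeds a threshold, the gradient is rescaled to have exactly that norm, and otherwise it is left unchanged. The function name `clip_by_norm` and the flat-list gradient representation are assumptions for this sketch, not details from the cited work.

```python
import math
from typing import List


def clip_by_norm(grad: List[float], threshold: float) -> List[float]:
    """Rescale grad so its L2 norm does not exceed threshold.

    Hypothetical helper: the gradient is a flat list of floats, and the
    threshold is assumed to be positive.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        # Small gradients pass through untouched (this also covers norm == 0).
        return list(grad)
    scale = threshold / norm
    return [g * scale for g in grad]


if __name__ == "__main__":
    print(clip_by_norm([3.0, 4.0], 1.0))   # norm 5.0, rescaled to norm 1.0
    print(clip_by_norm([0.3, 0.4], 1.0))   # norm 0.5, returned unchanged
```

Rescaling keeps the gradient's direction and only bounds its length, which is why it limits the damage from exploding gradients without changing which way the update points.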

Bayesian Recurrent Neural Network Models for Forecasting and Quantifying Uncertainty in Spatial-Temporal Data

McDermott, Wikle 2019

Entropy

Recurrent neural networks (RNNs) are nonlinear dynamical models commonly used in the machine learning and dynamical systems literature to represent complex dynamical or sequential relationships between variables. More recently, as deep learning models have become more common, RNNs have been used to forecast increasingly complicated systems. Dynamical spatio-temporal processes represent a class of complex systems that can potentially benefit from these types of models. Although the RNN literature is expansive and highly developed, uncertainty quantification is often ignored. Even when considered, the uncertainty is generally quantified without the use of a rigorous framework, such as a fully Bayesian setting. Here we attempt to quantify uncertainty in a more formal framework while maintaining the forecast accuracy that makes these models appealing, by presenting a Bayesian RNN model for nonlinear spatio-temporal forecasting. Additionally, we make simple modifications to the basic RNN to help accommodate the unique nature of nonlinear spatio-temporal data. The proposed model is applied to a Lorenz simulation and two real-world nonlinear spatio-temporal forecasting applications.
