Initial Matting-Guided Visual Tracking With Siamese Network

Qin, Xiaofei; Zhang, Fan

doi:10.1109/access.2019.2907282

Cited by 10 publications

(13 citation statements)

References 34 publications

“…FICFNet [42] computes channel attention on both Siamese pipeline branches to weight feature channels. IMG-Siam [43] fuses the target foreground using channel attention and a superpixel-based matting algorithm to provide enhanced target appearance with structural information. FlowTrack [44] uses temporal attention to capture target temporal information.…”

Section: Attention Based Trackers (mentioning)

confidence: 99%

“…However, during tracking, we required only a pre-trained model and the first frame of the video to track the sequence. On the other hand, the existing attention-based trackers including MemTrack [40] and MemDTC [45] maintain previous memory for the tracked object and update accordingly; IMG-Siam [43] uses super-pixel based matting to extract the target foreground; FlowTrack [44] utilizes the historical frames for model update; FICFNet [42] integrates an attention module into both target and search branches.…”

Section: B. Stacked Channel-spatial Attention (mentioning)

confidence: 99%

“…where σ represents the usual sigmoid function f(x) = 1/(1 + e^(−x)) … for spotting the target location that provides a good complement to channel attention. Previously, Qin et al. [54] constructed a spatial mask using super-pixels to exploit target representation. Li et al. [53] utilized global max pooling to encode the spatial attention in their model.…”

Section: Channel Attention (mentioning)

confidence: 99%

Efficient Visual Tracking With Stacked Channel-Spatial Attention Learning

2020

Template-based learning, particularly with Siamese networks, has recently become popular because it balances accuracy and speed. However, preserving tracker robustness in challenging scenarios at real-time speed remains a primary concern for visual object tracking. Siamese trackers have difficulty handling continual changes in target appearance because they learn limited discrimination between target and background information. This paper presents stacked channel-spatial attention within Siamese networks to improve tracker robustness without sacrificing tracking speed. The proposed channel attention strengthens target-specific channels by increasing their weights while reducing the importance of irrelevant channels with lower weights. Spatial attention focuses on the most informative region of the target feature map. We integrate the proposed channel and spatial attention modules to enhance tracking performance with end-to-end learning. The proposed tracking framework learns what and where to highlight important target information for efficient tracking. Experimental results on the widely used OTB100, OTB50, VOT2016, VOT2017/18, TC-128, and UAV123 benchmarks verify that the proposed tracker achieves outstanding performance compared with state-of-the-art trackers. Index terms: Deep learning, Siamese architecture, stacked channel-spatial attention, visual object tracking.
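The idea of stacking channel attention ("what" to emphasize) and spatial attention ("where" to emphasize) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the channel gate is a sigmoid of the channel's global average scaled by a hypothetical per-channel weight, and the spatial mask is a sigmoid of the maximum across channels. The function names are likewise hypothetical.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features, weights):
    """Scale each channel by a sigmoid gate computed from its global average.

    features: list of C channels, each an H x W list of lists.
    weights:  hypothetical per-channel scalars standing in for learned layers.
    """
    gated = []
    for channel, w in zip(features, weights):
        values = [v for row in channel for v in row]
        gate = sigmoid(w * sum(values) / len(values))  # squeeze, then gate
        gated.append([[v * gate for v in row] for row in channel])
    return gated


def spatial_attention(features):
    """Scale each location by a sigmoid gate of the maximum over channels."""
    height, width = len(features[0]), len(features[0][0])
    mask = [[sigmoid(max(ch[i][j] for ch in features)) for j in range(width)]
            for i in range(height)]
    return [[[ch[i][j] * mask[i][j] for j in range(width)]
             for i in range(height)] for ch in features]


def stacked_attention(features, weights):
    """Apply channel attention first, then spatial attention (assumed order)."""
    return spatial_attention(channel_attention(features, weights))
```

Calling `stacked_attention(features, weights)` on a C x H x W nested list returns a map of the same shape, so in a Siamese tracker such a module could in principle sit between the backbone and the correlation step without changing tensor sizes.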

“…In recent years, the success of deep learning in object tracking has led to its supplanting traditional methods [16] in high-performance applications [3], [17]. It is difficult to train a network to track a target from scratch, while the application of a Siamese network promises improvements.…”

Section: Related Work, A. Target Tracking by Siamese Network (mentioning)

confidence: 99%

Distractor-Aware Visual Tracking by Online Siamese Network

Zha, Qiu, et al. 2019. IEEE Access

Most trackers based on Siamese networks rely on off-line training and online tracking. In fact, online tracking is conducted in terms of deep features, which are extracted from a predefined network trained off-line on a large amount of data. However, these features are a general representation for similar objects, and therefore their discrimination ability is not enough to distinguish the current tracking target from the background, particularly from distractors. To tackle this problem, we propose to update the features extracted by a Siamese network online. These features can fit the target variations when tracking on the fly. Specifically, we extract the common features from the shallow convolutional layers trained off-line, and then employ them as inputs of the deep convolutional layers to learn the special features of the current target online. In addition, an integrated updating strategy is proposed to accelerate network convergence. It helps enhance the discrimination ability of the learned features to distinguish the current target from the background and distractors. We conducted extensive experiments on the OTB2015 and VOT2016 databases. The results demonstrate that our tracker effectively improves on the baseline algorithm and performs favorably against most state-of-the-art trackers in terms of accuracy and robustness. Index terms: Target tracking, Siamese network, offline training, online tracking.

“…Visual object tracking is one of the most basic problems in the application of human-computer interaction, visual analysis and auxiliary drive systems. Its purpose is to accurately estimate the position and scale of the object in the subsequent frame, according to the bounding box given in the first frame [1]. The appearance difference caused by illumination, deformation, occlusion, rotation and motion is a great challenge.…”

Section: Introduction (mentioning)

confidence: 99%

ACSiamRPN: Adaptive Context Sampling for Visual Object Tracking

et al. 2020

Self Cite

In visual object tracking fields, the Siamese network tracker, based on the region proposal network (SiamRPN), has achieved promising tracking effects, both in speed and accuracy. However, it did not consider the relationship and differences between the long-range context information of various objects. In this paper, we add a global context block (GC block), which is lightweight and can effectively model long-range dependency, to the Siamese network part of SiamRPN so that the object tracker can better understand the tracking scene. At the same time, we propose a novel convolution module, called a cropping-inside selective kernel block (CiSK block), based on selective kernel convolution (SK convolution, a module proposed in selective kernel networks) and use it in the region proposal network (RPN) part of SiamRPN, which can adaptively adjust the size of the receptive field for different types of objects. We make two improvements to SK convolution in the CiSK block. The first improvement is that in the fusion step of SK convolution, we use both global average pooling (GAP) and global maximum pooling (GMP) to enhance global information embedding. The second improvement is that after the selection step of SK convolution, we crop out the outermost pixels of features to reduce the impact of padding operations. The experiment results show that on the OTB100 benchmark, we achieved an accuracy of 0.857 and a success rate of 0.643. On the VOT2016 and VOT2019 benchmarks, we achieved expected average overlap (EAO) scores of 0.394 and 0.240, respectively.
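The two modifications described in this abstract, combining global average pooling (GAP) with global maximum pooling (GMP) and cropping the outermost pixels of a feature map, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how GAP and GMP are combined, so summation is assumed here, and the function names and the one-pixel default margin are hypothetical.

```python
def global_avg_pool(channel):
    """Mean of all values in one H x W channel."""
    values = [v for row in channel for v in row]
    return sum(values) / len(values)


def global_max_pool(channel):
    """Maximum of all values in one H x W channel."""
    return max(v for row in channel for v in row)


def fuse_descriptor(features):
    """One descriptor per channel: GAP plus GMP (summation is an assumption)."""
    return [global_avg_pool(ch) + global_max_pool(ch) for ch in features]


def crop_border(channel, margin=1):
    """Drop the `margin` outermost rows and columns of an H x W map."""
    if margin <= 0:
        return [list(row) for row in channel]  # nothing to crop
    return [row[margin:-margin] for row in channel[margin:-margin]]
```

Cropping the border discards the positions most affected by zero padding, which is the motivation the abstract gives for that step.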
