2020
DOI: 10.1109/access.2020.2997917

Efficient Visual Tracking With Stacked Channel-Spatial Attention Learning

Abstract: Template based learning, particularly Siamese networks, has recently become popular due to balancing accuracy and speed. However, preserving tracker robustness against challenging scenarios with real-time speed is a primary concern for visual object tracking. Siamese trackers confront difficulties handling target appearance changes continually due to less discrimination ability learning between target and background information. This paper presents stacked channel-spatial attention within Siamese networks to i…

Cited by 23 publications (13 citation statements)
References 74 publications
“…We noticed that the proposed method weakly performs in the low-resolution sequences depicted in Figures 4 and 5 compared with other trackers. We also observed that our method has a deficiency on TC128 benchmark against the SCSAtt [69] tracker that introduced stacked channel-spatial attention based Siamese network for tracking. SCSAtt utilized the spatial feature map to locate the target besides focusing on the object's informative part.…”
Section: Discussion (mentioning)
confidence: 85%
“…Similar to the OTB benchmark evaluation, we adopt success and precision plots to compare with the state-of-the-art trackers. We compare our tracker with SCSAtt [69], MEEM [63], SRDCF [61], MUSTER [65], SAMF [64], KCF [68], DSST [66], Struck [67], and CSK [70] on this benchmark.…”
Section: Experiments on TC128 Benchmark (mentioning)
confidence: 99%
“…A Siamese network comprises of two parallel Convolutional Neural Networks (CNN) streams that are used to learn the similarity between input images in embedded space and to fuse them to produce an output [46]. Owing to their inherent characteristics such as accuracy and speed, Siamese networks are popular in the visual tracking community [10, 15, 16, 17, 47]. A SiameseFC [15] extracts input image features using an embedded CNN model and fuses them by using a correlation layer, to generate a response map.…”
Section: Related Work (mentioning)
confidence: 99%
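
The correlation step described in the statement above can be illustrated with a small sketch. It assumes the two feature maps have already been extracted by the shared CNN and are given as nested lists of shape channels x height x width; the function names `cross_correlate` and `peak_location` are hypothetical and are not taken from SiameseFC or from any library.

```python
def cross_correlate(search, template):
    """Slide the template features over the search features.

    Both inputs are nested lists shaped [channels][height][width].
    Returns a 2-D response map whose entries are the summed
    element-wise products at each valid offset (no padding, stride 1).
    """
    channels = len(template)
    t_h, t_w = len(template[0]), len(template[0][0])
    s_h, s_w = len(search[0]), len(search[0][0])
    response = []
    for y in range(s_h - t_h + 1):
        row = []
        for x in range(s_w - t_w + 1):
            score = 0.0
            for c in range(channels):
                for i in range(t_h):
                    for j in range(t_w):
                        score += search[c][y + i][x + j] * template[c][i][j]
            row.append(score)
        response.append(row)
    return response


def peak_location(response):
    """Return (row, col) of the first maximum in the response map."""
    best_y, best_x = 0, 0
    for y, row in enumerate(response):
        for x, value in enumerate(row):
            if value > response[best_y][best_x]:
                best_y, best_x = y, x
    return best_y, best_x


# Toy data (made up for illustration): a 2x2 block of ones inside a 4x4 map.
search = [[[0, 0, 0, 0],
           [0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 0, 0, 0]]]
template = [[[1, 1],
             [1, 1]]]
response = cross_correlate(search, template)
print(peak_location(response))
```

In a real tracker this operation would be run as a batched convolution on learned features; the nested loops here only make the arithmetic behind the response map visible.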
“…Reference [90] proposed spatial attention (SCSAtt), which ensured the model’s speed and increases its robustness. SCSAtt uses weight allocation to highlight the importance of the feature of the channel, namely the channel attention module, and uses the spatial attention module to highlight the area with the most information on the feature diagram to determine the target location.…”
Section: Target Tracking Algorithm Based On a Deep Learning Networ (mentioning)
confidence: 99%
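
The two attention steps described in the statement above, channel weighting followed by spatial weighting, can be sketched roughly as follows. This is not the SCSAtt architecture: as a simplifying assumption, the learned layers are replaced by a plain sigmoid gate on pooled averages, and all function names are hypothetical.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def channel_attention(features):
    """Scale each channel by a gate computed from its global average.

    `features` is a nested list shaped [channels][height][width].
    Assumption: a real module would pass the pooled vector through
    learned layers; here the gate is just sigmoid(mean).
    """
    weighted = []
    for channel in features:
        count = sum(len(row) for row in channel)
        gate = sigmoid(sum(sum(row) for row in channel) / count)
        weighted.append([[v * gate for v in row] for row in channel])
    return weighted


def spatial_attention(features):
    """Scale each location by a gate computed from its cross-channel mean.

    Returns the reweighted features and the 2-D spatial mask.
    """
    n_ch = len(features)
    height, width = len(features[0]), len(features[0][0])
    mask = [[sigmoid(sum(features[c][y][x] for c in range(n_ch)) / n_ch)
             for x in range(width)]
            for y in range(height)]
    weighted = [[[features[c][y][x] * mask[y][x] for x in range(width)]
                 for y in range(height)]
                for c in range(n_ch)]
    return weighted, mask


def channel_spatial_attention(features):
    """Apply channel attention, then spatial attention, in sequence."""
    return spatial_attention(channel_attention(features))


# Toy features (made up): one active location in channel 0, channel 1 empty.
features = [[[4.0, 0.0], [0.0, 0.0]],
            [[0.0, 0.0], [0.0, 0.0]]]
refined, mask = channel_spatial_attention(features)
print(mask[0][0] > mask[1][1])
```

The spatial mask is largest where the channel-weighted features are strongest, which is the sense in which the quoted statement says the spatial module highlights the most informative area.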