2020
DOI: 10.1016/j.ins.2019.12.084

Learning reinforced attentional representation for end-to-end visual tracking

Abstract: Despite the fact that tremendous advances have been made by numerous recent tracking approaches in the last decade, how to achieve high-performance visual tracking is still an open problem. In this paper, we propose an end-to-end network model to learn reinforced attentional representation for accurate target object discrimination and localization. We utilize a novel hierarchical attentional module with long short-term memory and multi-layer perceptrons to leverage both inter- and intra-frame attention to effec…

Citations: cited by 94 publications (25 citation statements)
References: 52 publications (138 reference statements)
“…Inspired by the SiamRPN, DaSiamRPN [31] improved the discrimination of the tracker by adding hard negative data in the training process. RAR [42] uses LSTM to integrate the DCF framework as a correlation layer into the Siamese network. Advanced Siamese networks, such as the SiamRPN++ [29], SiamMask [43] and SiamDW [41], optimized the architecture by using modern deep networks.…”
Section: A Siamese Network Based Visual Trackers
Citation type: mentioning
confidence: 99%
“…To cope with this, we propose cross-resolution features, operating on high- and low-resolution input images, to integrate features from multiple abstraction levels with low overhead in network complexity and with high computational efficiency. Existing works on Siamese ConvNets have been promising in utilizing parallel network backbones [17,18]. Third, mobile inverted bottleneck convolution (MBConv) [38] with built-in squeeze-and-excitation (SE) [22] and Swish activation [37] integrated in EfficientNets has proven more accurate in image classification tasks [47,48] than regular convolutions [21,23,45], while substantially reducing the computational costs [47].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Automatic number plate recognition (ANPR) systems have become a very important tool in many surveilling applications over the past few decades. They are often used as a surveillance technique to identify licence plates of vehicles and are very useful for security systems, highway road tolling systems, traffic sign systems, tracking, and parking management systems [1][2][3][4][5]. The existing systems often work under some standard conditions, such as low-high lighting, rain, and limited day-night lighting.…”
Section: Introduction
Citation type: mentioning
confidence: 99%