2020
DOI: 10.3390/s20020393
Object Tracking in RGB-T Videos Using Modal-Aware Attention Network and Competitive Learning

Abstract: Object tracking in RGB-thermal (RGB-T) videos is increasingly used in many fields due to the all-weather and all-day working capability of the dual-modality imaging system, as well as the rapid development of low-cost and miniaturized infrared camera technology. However, it is still very challenging to effectively fuse dual-modality information to build a robust RGB-T tracker. In this paper, an RGB-T object tracking algorithm based on a modal-aware attention network and competitive learning (MaCNet) is proposed…

Cited by 107 publications (55 citation statements)
References 39 publications
“…Therefore, this work can act as a reference for researchers who are interested in RGBT tracking.

                 VOT-RGBT2019 [43]            VOT-RGBT2020 [44]
Tracker          Acc.(↑)  Rob.(↑)  EAO(↑)     Acc.(↑)  Rob.(↑)  EAO(↑)
mfDiMP [39]      0.6019   0.8036   0.3879     0.6380   0.7930   0.3800
MANet [18]       0.5823   0.7010   0.3463     -        -        -
SiamFT [33]      0.6300   0.6390   0.3100     -        -        -
MaCNet [25]      0.5451   0.5914   0.3052     -        -        -
JMMAC [42]       0.6649   0.8211   0.4826     0.6620   0.8180   0.4200
MANet++ [22]     0.5092   0.5379   0.2716     -        -        -
TFNet [27]       0.4617   0.5936   0.2878     -        -        -
FANet [14]       0.4724   0.5078   0.2465     -        -        -
ADRNet [29]      0.6218   0.7657   0.3959     -        -        -
SiamCDA [36]     0.6820   0.7570   0.4240     -        -        -…”
Section: Discussion
confidence: 99%
“…However, to keep the modality-specific characteristics discriminative, the features from the RGB and TIR modalities are also retained in some methods. Specifically, MaCNet [25] learns the fusion weights through an independent modal-aware attention network and competitive learning [26]; the features of the RGB and TIR modalities are preserved, leveraging the results from each single modality as well as from the modality-fused branch. TFNet [27] deploys a trident branch architecture in which each branch is specific to the RGB, TIR, and fused features.…”
Section: A. MDNet-Based RGBT Trackers
confidence: 99%
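The modality-weighted fusion described in the quote above can be sketched as follows. This is a minimal illustration assuming simple softmax-normalized scalar weights over the two modalities; MaCNet's actual attention network is learned end-to-end, and the function and parameter names here are illustrative only.

```python
import numpy as np

def modal_aware_fusion(rgb_feat, tir_feat, w_rgb, w_tir):
    """Fuse RGB and TIR feature maps with softmax-normalized modality weights.

    Toy sketch of modality-weighted fusion; the real MaCNet learns these
    weights with a modal-aware attention network (names are illustrative).
    """
    logits = np.array([w_rgb, w_tir], dtype=float)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                      # softmax over modalities
    fused = weights[0] * rgb_feat + weights[1] * tir_feat
    # Return the per-modality features alongside the fused branch, since the
    # quoted description notes MaCNet preserves both alongside the fusion.
    return fused, rgb_feat, tir_feat, weights

rgb = np.ones((4, 4))
tir = np.zeros((4, 4))
fused, _, _, w = modal_aware_fusion(rgb, tir, w_rgb=1.0, w_tir=1.0)
# Equal logits give equal weights, so the fused map is the elementwise mean.
```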
“…These methods have fast tracking speed, but are usually weak at representing low-resolution objects, which are common in RGBT tracking. The other main research stream builds on the MDNet framework [30], [31], [32], [7], [33], [34], [35], [36], [37], [38], applying different fusion strategies to exploit the complementary benefits of RGB and thermal data. Such methods achieve robust tracking results but have low efficiency, and their tracking capacity is limited by MDNet, which is based on the VGG network.…”
Section: B. RGBT Tracking Methods
confidence: 99%
“…All the deep-learning-based trackers mentioned above preserve only the fused features or the features from each modality. To retain the original features after fusion, [43] performs multi-modal multi-layer fusion with weights computed from the original input images. Similarly, TFNet [44] uses a trident fusion network to better exploit the complementary characteristics of the RGB and TIR modalities.…”
Section: A. Cross-Modal Fusion Mechanisms
confidence: 99%
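The trident ("three-branch") layout that the quote attributes to TFNet can be sketched as parallel RGB, TIR, and fused branches whose outputs are all kept. This is an assumption-laden toy: in TFNet each branch is a learned network, and the fusion step here is a plain elementwise mean standing in for the learned fusion.

```python
import numpy as np

def trident_branches(rgb_feat, tir_feat):
    """Toy 'trident' layout: parallel RGB, TIR, and fused feature branches.

    Illustrative only; in the actual TFNet each branch is a trained network
    and the fusion is learned, not the fixed average used here.
    """
    fused = 0.5 * (rgb_feat + tir_feat)  # placeholder fusion: elementwise mean
    # All three branches are retained, so modality-specific features survive
    # alongside the fused representation.
    return {"rgb": rgb_feat, "tir": tir_feat, "fused": fused}

branches = trident_branches(np.full((2, 2), 2.0), np.zeros((2, 2)))
```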