Proceedings of the 27th ACM International Conference on Multimedia 2019
DOI: 10.1145/3343031.3350928
Dense Feature Aggregation and Pruning for RGBT Tracking

Abstract: How to perform effective information fusion of different modalities is a core factor in boosting the performance of RGBT tracking. This paper presents a novel deep fusion algorithm based on the representations from an end-to-end trained convolutional neural network. To exploit the complementarity of features of all layers, we propose a recursive strategy to densely aggregate these features, yielding robust representations of target objects in each modality. In different modalities, we propose to prune the dens…
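The recursive dense-aggregation idea in the abstract can be illustrated with a toy sketch: feature maps from progressively deeper layers are resized to a common spatial size and fused into a running aggregate, so the final representation mixes low- and high-level cues. This is an assumption-laden stand-in, not the paper's network; the simple averaging fusion and nearest-neighbour resize below replace the learned operations of the actual method.

```python
import numpy as np

def densely_aggregate(layer_feats):
    """Recursively fuse per-layer feature maps (illustrative only).

    Each deeper layer's map is resized to the shallowest layer's spatial
    size and averaged into the running aggregate. Averaging is a toy
    stand-in for the learned fusion in the paper.
    """
    def resize(x, hw):
        # Nearest-neighbour resize to a common spatial size (toy stand-in
        # for learned up/down-sampling in a real network).
        h, w = x.shape[1:]
        rows = np.arange(hw[0]) * h // hw[0]
        cols = np.arange(hw[1]) * w // hw[1]
        return x[:, rows][:, :, cols]

    target_hw = layer_feats[0].shape[1:]
    agg = layer_feats[0]
    for feat in layer_feats[1:]:
        agg = 0.5 * (agg + resize(feat, target_hw))  # recursive fusion step
    return agg

# Three fake feature maps (channels, H, W), shallow to deep.
feats = [np.ones((4, 16, 16)), 2 * np.ones((4, 8, 8)), 4 * np.ones((4, 4, 4))]
out = densely_aggregate(feats)
print(out.shape)  # (4, 16, 16)
```

In the paper this aggregation is performed per modality (RGB and thermal) before pruning; the sketch shows only the single-modality recursion.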

Cited by 164 publications (116 citation statements). References 25 publications.
“…Along with the improvement of the above-mentioned methods, the corresponding performance of RGB-T trackers has been continuously upgraded. Representative works [2, 6–11] are based on sparse representation, correlation filtering, and deep learning. Li et al. [2] proposed a cross-modal manifold ranking algorithm, which mitigated the influence of background clutter during tracking.…”
Section: RGB-T Tracking
Confidence: 99%
“…The RGB-T object tracking problem is an extension of the traditional visual tracking task: given the initial position state of the target, the RGB and thermal infrared images are used jointly to continuously estimate the target position in subsequent frames. In recent years, several works have been carried out on RGB-T tracking; representative approaches fall roughly into two categories: tracking based on traditional hand-crafted features [1–7] and tracking based on deep learning [8–11]. The former category mostly builds on theoretical frameworks such as sparse representation [2–5], correlation filtering [6], and Bayesian filtering [7], and uses hand-crafted texture or local features to construct cross-modal object appearance models and state-estimation methods. The latter class builds effective target models from massive data by exploiting the powerful feature representation capabilities of deep neural networks.…”
Section: Introduction
Confidence: 99%
“…In particular, our tracker outperforms DAT, RT-MDNet and ECO by 12.0%/11%, 14.6%/11.5% and 12.1%/9.7% in PR/SR, respectively. It also performs better than DAPNet [40] by 0.9%/2.1% in PR/SR, and our algorithm runs 6 times faster. The overall promising performance of our method can be explained by the fact that FANet makes full use of hierarchical deep features and RGBT information to handle the challenges of significant appearance changes and adverse environmental conditions.…”
Section: Evaluation on GTOT Dataset
Confidence: 86%
“…Runtime analysis. Finally, we report the runtime of our FANet against the state-of-the-art trackers MDNet [16]+RGBT, MANet [49], DAPNet [40], CMR [9], and SGT [7], together with their tracking performance on the RGBT234 dataset, in Table IV. Our implementation uses PyTorch 0.4.1 on a 2.1 GHz Intel(R) Xeon(R) CPU E5-2620 with an NVIDIA GeForce GTX 2080Ti GPU, and the average tracking speed is 19 FPS.…”
Section: Analysis of Our Network
Confidence: 99%
“…Visual tracking [4,11,17,39,63] remains one of the most active and important research areas in computer vision; it aims to precisely predict the location of an arbitrary target in consecutive frames given an initial location (e.g., a bounding-box annotation). Although a variety of visual tracking models [6,31,56,64] have been developed, visual tracking is still an ongoing and challenging task due to large variations from occlusion, obscureness, fast motion and deformation (i.e., the common challenges shown in [51]).…”
Section: Introduction
Confidence: 99%