2022
DOI: 10.1007/978-3-031-20077-9_30

Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection

Cited by 22 publications (4 citation statements)
References 34 publications
“…They introduced UA-CMDet, a method integrating visible, infrared, and bimodal fusion branches for detection. Yuan et al [38] developed TSFADet, aiming to mitigate the adverse effects of cross-modal misalignment by equalizing the discrepancies between two modal features. Zhang et al [39] proposed SuperYOLO, incorporating a cross-modal fusion module to extract supplementary information from the data.…”
Section: Multispectral Object Detection Algorithm (mentioning)
confidence: 99%
“…The evaluation results, as presented in Table 4, demonstrate significant improvements achieved by GMD-YOLO. [Table 4 rows, flattened in extraction: [35], RGB + IR, ResNet, -, 73.9; UA-CMDet [37], RGB + IR, YOLO, -, 64.0; ECISNet [36], RGB + IR, ResNet, -, 76.0; TSFADet [38], RGB + IR, ResNet, -, 73.0; GMD-YOLO, RGB + IR, YOLO, 80.3, 78.0]…”
Section: Comparison Experiments 441 Experiments On the Dronevehicle D... (mentioning)
confidence: 99%
“…UA-CMDet [23], RISNet [24], and ECISNet [25] optimize the cross-modal mutual information utilization to improve the detection performance. TSFADet [26] aligns cross-modal objects from translation, scaling, and rotation aspects through a network.…”
Section: Introduction (mentioning)
confidence: 99%
“…Spatial misalignment: Since RGB and infrared sensors have different coordinate systems, fields of view, and sampling frequencies, pairs of RGB and infrared images usually are spatial misaligned [21], resulting in low-quality bounding boxes predicted by RGB and infrared fusion object detection, as shown in Figure 1a. Figure 1b indicates that the fusion of original RGB and infrared images via the image fusion algorithm [22] will take in a significantly misaligned ghost, which will also disturb localization.…”
Section: Introduction (mentioning)
confidence: 99%