2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00523

Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection

Abstract: Multispectral pedestrian detection has shown great advantages under poor illumination conditions, since the thermal modality provides complementary information to the color image. However, real multispectral data suffer from the position shift problem: the color-thermal image pairs are not strictly aligned, so the same object appears at different positions in the two modalities. In deep learning based methods, this problem makes it difficult to fuse the feature maps from both modalities and puzzles the CNN tr…

Cited by 148 publications (141 citation statements); references 57 publications (92 reference statements).
“…We evaluated the proposed fusion method on the KAIST testing dataset in the reasonable setting and in all settings in comparison with ACF+T+THOG [ 14 ], Halfway Fusion [ 22 ], Fusion RPN+BDT [ 30 ], IAF R-CNN [ 28 ], IATDNN+IASS [ 27 ], CIAN [ 37 ], MSDS-RCNN [ 17 ], ARCNN [ 38 ], MBNet [ 39 ], and FusionCSPNet [ 18 ]. Among these detection methods, FusionCSPNet and our method were one-stage methods, and the rest were two-stage methods.…”
Section: Methods
Mentioning confidence: 99%
“…Miss rate comparisons on the KAIST dataset (lower is better):

Method                  All      Day      Night
ACF [1]                 47.32%   42.57%   56.17%
Halfway Fusion [21]     25.75%   24.88%   26.59%
Fusion RPN+BF [5]       18.29%   19.57%   16.27%
IAF R-CNN [10]          15.73%   14.55%   18.26%
IATDNN+IASS [9]         14.95%   14.67%   15.72%
CIAN [7]                14.12%   14.77%   11.13%
MSDS-RCNN [6]           11.34%   10.53%   12.94%
AR-CNN [18]             9.34%    9.94%    8.38%
MBNet [12]              8.13%    8.28%    7.86%
Ours (full dataset)     8.86%    10.01%   6.77%
Ours (10.26% of data)   9.32%    10.13%   7.70%

Table 3. mIoU comparisons on TOKYO Dataset:

Method                  —        —        —
[3]                     39.7%    36.1%    36.8%
FuseNet [14]            45.6%    41.0%    43.9%
RTFNet [13]             53.2%    45.8%    54.8%
Ours (full dataset)     53.6%    46.8%    53.3%
Ours (17.99% of data)   51.0%    46.6%    48.9%…”
Section: Methods
Mentioning confidence: 99%
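For context, the KAIST miss rates quoted above follow the benchmark's standard metric: the log-average miss rate, sampled at FPPI (false positives per image) values log-spaced over [10^-2, 10^0]. The sketch below is illustrative only; the function name and the toy curve are assumptions, not taken from any of the cited papers.

```python
import numpy as np

def log_average_miss_rate(fppi, miss_rate, n_points=9):
    """Log-average miss rate (lower is better): sample the miss rate at
    n_points FPPI values evenly spaced in log-space over [1e-2, 1e0],
    then average the samples in log-space (geometric mean)."""
    refs = np.logspace(-2.0, 0.0, n_points)
    samples = []
    for r in refs:
        # best (lowest) miss rate among operating points with fppi <= r;
        # if no operating point reaches this FPPI, the miss rate is 1.0
        below = miss_rate[fppi <= r]
        samples.append(below.min() if below.size else 1.0)
    return float(np.exp(np.mean(np.log(np.maximum(samples, 1e-10)))))

# toy detector curve: miss rate falls as the FPPI budget grows
fppi = np.array([0.005, 0.01, 0.05, 0.1, 0.5, 1.0])
mr = np.array([0.60, 0.45, 0.30, 0.20, 0.12, 0.08])
print(log_average_miss_rate(fppi, mr))
```

The geometric mean is what makes a single percentage like "8.13%" comparable across papers, since it weights low-FPPI operating points as heavily as high-FPPI ones.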
“…This well-known multispectral dataset is built for the pedestrian detection task. In order to tackle the misalignment problem between visible-thermal image pairs, [18] proposes the "paired" annotations by separately relabelling pedestrians for each modality. We remove unpaired images according to the matching of visible and thermal annotations, thus keeping 11,695 images for training.…”
Section: Datasets: KAIST Dataset [1]
Mentioning confidence: 99%
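The unpaired-image filtering described above can be sketched as a greedy IoU matching between the visible and thermal annotation sets. The helper names and the 0.5 threshold below are illustrative assumptions, not the authors' exact criterion.

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def is_paired(visible_boxes, thermal_boxes, thr=0.5):
    """Keep an image pair only if the two annotation sets match:
    same count, and every visible box greedily pairs with a distinct
    thermal box at IoU >= thr."""
    if len(visible_boxes) != len(thermal_boxes):
        return False
    unmatched = list(thermal_boxes)
    for v in visible_boxes:
        best = max(unmatched, key=lambda t: iou(v, t), default=None)
        if best is None or iou(v, best) < thr:
            return False
        unmatched.remove(best)
    return True
```

Running such a filter over the training split is one plausible way to arrive at a reduced set like the 11,695 images mentioned above.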
“…An accumulated probability fusion layer was also introduced to combine probabilities from different modalities at the proposal level. To address the position shift problem of multispectral data, Zhang et al. [54] proposed a region feature alignment module to capture position shifts and a confidence-aware fusion method to merge the two modalities.…”
Section: Related Work
Mentioning confidence: 99%
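As a rough illustration of the idea attributed to [54] above, the sketch below translates a thermal feature map by a predicted offset before a confidence-weighted fusion. An integer shift stands in for the learned region feature alignment, and every name and parameter here is hypothetical, not the paper's implementation.

```python
import numpy as np

def shift_features(feat, dx, dy):
    """Translate a (C, H, W) feature map by integer offsets (dx, dy),
    zero-padding the exposed border -- a crude stand-in for the warp a
    region feature alignment module would predict and apply."""
    out = np.zeros_like(feat)
    _, h, w = feat.shape
    ys = slice(max(dy, 0), h + min(dy, 0))
    xs = slice(max(dx, 0), w + min(dx, 0))
    ys_src = slice(max(-dy, 0), h + min(-dy, 0))
    xs_src = slice(max(-dx, 0), w + min(-dx, 0))
    out[:, ys, xs] = feat[:, ys_src, xs_src]
    return out

def confidence_aware_fusion(color_feat, thermal_feat, w_color, w_thermal, dx, dy):
    """Align the thermal map to the color map with the predicted shift,
    then merge the modalities with per-modality confidence weights
    (assumed here to sum to 1)."""
    aligned = shift_features(thermal_feat, dx, dy)
    return w_color * color_feat + w_thermal * aligned
```

The key design point the sketch preserves is the ordering: alignment happens before fusion, so the weighted sum combines features that refer to the same spatial location in both modalities.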