2023
DOI: 10.1109/tmm.2022.3171688

RGBT Salient Object Detection: A Large-Scale Dataset and Benchmark

Abstract: Recently, many breakthroughs have been made in the field of Video Object Detection (VOD), but performance is still limited by the imaging limitations of RGB sensors in adverse illumination conditions. To alleviate this issue, this work introduces a new computer vision task called RGB-thermal (RGBT) VOD by introducing the thermal modality, which is insensitive to adverse illumination conditions. To promote the research and development of RGBT VOD, we design a novel Erasure-based Interaction Network (EINet) and e…

Cited by 90 publications (57 citation statements)
References: 100 publications
“…To verify the superiority of the proposed model, we compare it with twenty state-of-the-art (SOTA) SOD methods, which can be further classified into four categories: (1) three RGB SOD methods including CPD [31], BASNet [33], and EGNet [32]; (2) six RGB-D SOD methods including DMRA [42], MMCI [93], JL-DCF [44], S2MA [39], DPANet [45], and CDINet [49]; (3) three traditional RGB-T SOD methods including SGDL [65], MTMR [63], and M3S-NIR [64]; (4) eight deep learning RGB-T SOD methods including ADF [81], MMNet [67], MIDD [66], APNet [70], ECFFNet [69], CSRNet [28], CGFNet [68], and MIA [41]. All RGB and RGB-D SOD models are retrained on the same RGB-T training dataset as our model for fair comparison.…”
Section: Comparison With State-of-the-art Methods (mentioning)
confidence: 99%
“…This is because we regulate the interaction between the RGB image and the thermal image through the learned global illuminance score, supplement the semantic content for each layer of the thermal image branch in the encoding stage, and take full advantage of the valuable information that the thermal modality can provide in the decoding stage. Computational Complexity: To compare the complexity of different algorithms, we select four open-source RGB-T SOD models for comparison, including CGFNet [68], APNet [70], ADF [81], MIDD [66]. Table II shows the FLOPs (Floating Point Operations) and maximum F-measure of different algorithms on the VT1000 dataset.…”
Section: Comparison With State-of-the-art Methods (mentioning)
confidence: 99%
“…RGB-D SOD dataset includes NLPR [109] with 1,000 images, NJU2K [110] with 1,985 images, STERE [111] with 1,000 images, DES [112] with 135 images, SIP [59] with 929 images, and DUT [113] with 1,200 images. RGB-T SOD dataset includes VT821 [72], VT1000 [74], and VT5000 [114]. LF SOD dataset includes LFSD [81] with 100 light field data, HFUT-Lytro [115] with 255 samples, DUTLF-FS [84] with 1,462 light field images.…”
Section: Experiments, A. Datasets (mentioning)
confidence: 99%
“…Similar to previous approaches of combining thermal images with RGB images, Zhengzheng et al. [62] propose fusing RGB images with thermal images to detect objects in adverse conditions. A two-stream convolutional neural network architecture generates features from RGB and thermal images.…”
Section: Related Work (mentioning)
confidence: 99%