2022
DOI: 10.1609/aaai.v36i3.20257

Self-Supervised Pretraining for RGB-D Salient Object Detection

Abstract: Existing CNN-based RGB-D salient object detection (SOD) networks all require pretraining on ImageNet to learn hierarchical features that provide a good initialization. However, collecting and annotating large-scale datasets is time-consuming and expensive. In this paper, we use self-supervised representation learning (SSL) to design two pretext tasks: cross-modal auto-encoding and depth-contour estimation. Our pretext tasks require only a small amount of unlabeled RGB-D data…
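To make the abstract's two pretext tasks concrete, here is a minimal PyTorch sketch, assuming a toy encoder/decoder. The module layouts, channel sizes, and the gradient-threshold contour target are all illustrative placeholders, not the paper's actual architecture or losses:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalAutoEncoder(nn.Module):
    """Pretext task 1 (sketch): reconstruct the depth map from the RGB
    image, so the encoder must learn cross-modal structure from
    unlabeled RGB-D pairs."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, rgb):
        return self.decoder(self.encoder(rgb))

class DepthContourHead(nn.Module):
    """Pretext task 2 (sketch): predict a contour map whose target is
    derived automatically from depth gradients, so no human labels
    are needed."""
    def __init__(self, in_channels=64):
        super().__init__()
        self.head = nn.Sequential(
            nn.ConvTranspose2d(in_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, features):
        return self.head(features)

# Toy training step on an unlabeled RGB-D pair.
rgb = torch.randn(2, 3, 64, 64)    # RGB input
depth = torch.rand(2, 1, 64, 64)   # paired depth map (no annotation)
model = CrossModalAutoEncoder()
contour_head = DepthContourHead()

recon_loss = F.mse_loss(model(rgb), depth)

# Pseudo ground-truth contours from vertical depth discontinuities.
dy = depth[:, :, 1:, :] - depth[:, :, :-1, :]
contour_target = (dy.abs() > 0.1).float()
contour_target = F.pad(contour_target, (0, 0, 0, 1))  # restore height
contour_logits = contour_head(model.encoder(rgb))
contour_loss = F.binary_cross_entropy_with_logits(contour_logits, contour_target)
loss = recon_loss + contour_loss
```

The key point the abstract makes is that both targets (the depth map and the contour map) come from the data itself, which is what lets pretraining proceed without annotations.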

Cited by 48 publications (29 citation statements)
References 50 publications
“…We compare the proposed DIN model with 17 state-of-the-art saliency models, including DANet [36], PGAR [61], CMWNet [62], ATSA [19], D3Net [55], DSA2F [20], DCF [63], HAINet [64], CDNet [65], CDINet [66], MSIRN [67], SSP [39], FCMNet [68], and PASNet [44]. For a fair comparison, the saliency maps of the evaluated methods are either provided by their authors or obtained by running the released code.…”
Section: Results and Analysis (mentioning)
confidence: 99%
“…In this section, we present ablation studies to verify the effectiveness of each main component of the proposed DIN model, comparing against DANet [36], PGAR [61], CMWNet [62], ATSA [19], D3Net [55], DSA2F [20], DCF [63], HAINet [64], CDNet [65], CDINet [66], and SSP [39].…”
Section: Ablation Studies (mentioning)
confidence: 99%
“…The proposed method was compared with 16 salient object detection methods proposed in recent years: 3DCNN [15], HAIN [20], DSAM [19], ICNet [16], JLDCF [14], UCNet [7], SSF [31], SM2A [32], CMW [17], PGAR [33], SSP [34], ASSR [13], DPAG [9], D3Net [6], DMRA [10], and DLRF [8]. The results of the method on the four test datasets are listed in Table 1. Higher values of F_max, S_m, and E_m indicate better results.…”
Section: Comparison with the State-of-the-Art (mentioning)
confidence: 99%
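For context on the metrics cited here: F_max is the maximum F-measure over binarization thresholds, conventionally computed with β² = 0.3 in the SOD literature. A minimal NumPy sketch follows (S_m and E_m involve structural and enhanced-alignment terms and are omitted):

```python
import numpy as np

def max_f_measure(saliency, gt, beta_sq=0.3, num_thresholds=255):
    """Maximum F-measure over binarization thresholds.

    saliency: float array in [0, 1]; gt: binary array of the same shape.
    beta_sq = 0.3 is the weighting conventionally used in salient
    object detection to emphasize precision over recall.
    """
    gt = gt.astype(bool)
    best = 0.0
    for t in np.linspace(0, 1, num_thresholds):
        pred = saliency >= t
        tp = np.logical_and(pred, gt).sum()
        if tp == 0:
            continue  # avoids division by zero below
        precision = tp / pred.sum()
        recall = tp / gt.sum()
        f = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
        best = max(best, f)
    return best

# Toy usage with a random saliency map and a square ground-truth mask.
rng = np.random.default_rng(0)
sal = rng.random((64, 64))
gt = np.zeros((64, 64))
gt[16:48, 16:48] = 1
print(max_f_measure(sal, gt))
```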
“…In contrast with RGB images, depth maps are not limited by similar, indiscernible appearances: the foreground object is distinct from the background, which theoretically leads to better segmentation performance than using RGB images alone. In fact, depth contours are cleaner than RGB contours and tend to better describe the edge information of objects, allowing them to be separated more easily [62]. RGB images, on the other hand, are frequently used to provide global information [14].…”
Section: Introduction (mentioning)
confidence: 99%
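The observation that depth contours are cleaner than RGB contours is also what makes depth-contour estimation viable as a self-supervised target: contour labels can be generated automatically from depth discontinuities. A purely illustrative NumPy sketch, where the 0.1 threshold is an arbitrary placeholder and not from the cited papers:

```python
import numpy as np

def depth_contours(depth, threshold=0.1):
    """Approximate object contours as strong depth discontinuities.

    depth: 2D float array (metric or normalized depth).
    Returns a binary map where the depth gradient magnitude exceeds
    the threshold; such targets come for free from the depth channel,
    with no human annotation.
    """
    gy, gx = np.gradient(depth.astype(float))
    magnitude = np.hypot(gx, gy)
    return magnitude > threshold

# Toy example: a "foreground" block closer to the camera than the wall.
depth = np.full((64, 64), 3.0)   # background at 3 m
depth[20:44, 20:44] = 1.0        # object at 1 m
print(depth_contours(depth).sum())  # nonzero only along the object border
```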