RGB-D Salient Object Detection: A Survey

Zhou, Tao; Fan, Deng-Ping; Cheng, Ming-Ming; Shen, Jianbing; Shao, Ling

doi:10.48550/arxiv.2008.00230

Cited by 4 publications

(3 citation statements)

References 90 publications

(365 reference statements)


“…As in Fig. 2 b), existing RGB-D SOD models mainly rely on extracting salient features from RGB image and depth map respectively, and then fuse them in the early or late network stages [46]. Following this trend, earlier work [3] proposes to concatenate RGB-D pairs as 4-channel inputs for salient object detection.…”

Section: Related Work (mentioning)

confidence: 99%

RGB-D Salient Object Detection with Ubiquitous Target Awareness

Zhao et al. 2021

Preprint

Conventional RGB-D salient object detection methods aim to leverage depth as complementary information to find the salient regions in both modalities. However, the detection results rely heavily on the quality of the captured depth data, which is sometimes unavailable. In this work, we make the first attempt to solve the RGB-D salient object detection problem with a novel depth-awareness framework. This framework relies only on RGB data in the testing phase, utilizing captured depth data as supervision for representation learning. To construct our framework and achieve accurate saliency detection results, we propose a Ubiquitous Target Awareness (UTA) network to solve three important challenges in the RGB-D SOD task: 1) a depth awareness module to excavate depth information and to mine ambiguous regions via adaptive depth-error weights, 2) a spatial-aware cross-modal interaction and a channel-aware cross-level interaction, exploiting the low-level boundary cues and amplifying high-level salient channels, and 3) a gated multiscale predictor module to perceive the object saliency in different contextual scales. Besides its high performance, our proposed UTA network is depth-free for inference and runs in real time at 43 FPS. Experimental evidence demonstrates that our proposed network not only surpasses the state-of-the-art methods on five public RGB-D SOD benchmarks by a large margin, but also shows its extensibility on five public RGB SOD benchmarks.



“…[15] introduced a depth distiller to transfer the depth knowledge from the depth stream to the RGB stream to achieve a lightweight architecture without use of depth data at test time. A comprehensive survey can be found in [44].…”

Section: RGB-D Saliency Detection (mentioning)

confidence: 99%

Uncertainty Inspired RGB-D Saliency Detection

Zhang, Fan, Dai, et al. 2020

Preprint

Self Cite

We propose the first stochastic framework to employ uncertainty for RGB-D saliency detection by learning from the data labeling process. Existing RGB-D saliency detection models treat this task as a point estimation problem by predicting a single saliency map following a deterministic learning pipeline. We argue, however, that the deterministic solution is relatively ill-posed. Inspired by the saliency data labeling process, we propose a generative architecture to achieve probabilistic RGB-D saliency detection, which utilizes a latent variable to model the labeling variations. Our framework includes two main models: 1) a generator model, which maps the input image and latent variable to a stochastic saliency prediction, and 2) an inference model, which gradually updates the latent variable by sampling it from the true or approximate posterior distribution. The generator model is an encoder-decoder saliency network. To infer the latent variable, we introduce two different solutions: i) a Conditional Variational Auto-encoder with an extra encoder to approximate the posterior distribution of the latent variable; and ii) an Alternating Back-Propagation technique, which directly samples the latent variable from the true posterior distribution. Qualitative and quantitative results on six challenging RGB-D benchmark datasets show our approach's superior performance in learning the distribution of saliency maps. The source code is publicly available via our project page: https://github.com/JingZhang617/UCNet.


“…Discussing these works in detail is beyond the scope of this article. Please refer to the online benchmark (http://dpfan.net/d3netbenchmark/) and the latest survey [76] for more details.…”

Section: Deep Models (mentioning)

confidence: 99%

Bifurcated backbone strategy for RGB-D salient object detection

Zhai, Fan, Yang, et al. 2020

Preprint

Self Cite

Multi-level feature fusion is a fundamental topic in computer vision. It has been exploited to detect, segment and classify objects at various scales. When multi-level features meet multi-modal cues, choosing the optimal feature aggregation and multi-modal learning strategy becomes a difficult problem. In this paper, we leverage the inherent multi-modal and multi-level nature of RGB-D salient object detection to devise a novel cascaded refinement network. In particular, first, we propose to regroup the multi-level features into teacher and student features using a bifurcated backbone strategy (BBS). Second, we introduce a depth-enhanced module (DEM) to excavate informative depth cues from the channel and spatial views. Then, RGB and depth modalities are fused in a complementary way. Our architecture, named Bifurcated Backbone Strategy Network (BBS-Net), is simple, efficient, and backbone-independent. Extensive experiments show that BBS-Net significantly outperforms eighteen SOTA models on eight challenging datasets under five evaluation measures, demonstrating the superiority of our approach (∼4% improvement in S-measure vs. the top-ranked model: DMRA-iccv2019). In addition, we provide a comprehensive analysis of the generalization ability of different RGB-D datasets and provide a powerful training set for future research.
