2021
DOI: 10.1109/tip.2021.3049332

Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation

Abstract: 3D spatial information is known to be beneficial to the semantic segmentation task. Most existing methods take 3D spatial data as an additional input, leading to a two-stream segmentation network that processes RGB and 3D spatial information separately. This solution greatly increases the inference time and severely limits its scope for real-time applications. To solve this problem, we propose Spatial information guided Convolution (S-Conv), which allows efficient RGB feature and 3D spatial information integra…

Citations: cited by 114 publications (36 citation statements)
References: 56 publications
“…Datasets and metrics. Among the existing RGB-D segmentation problems, the indoor semantic segmentation is rather challenging, as the objects are often complex and with severe occlusions [5]. Thus, in order to validate the effectiveness of the proposed method, we conducted experiments on three indoor RGB-D benchmarks: NYU-Depth-V2 (NYUDv2-13 and -40) [25], SUN-RGBD [26] and Stanford Indoor Dataset (SID) [1].…”
Section: Methods (mentioning)
confidence: 99%
“…However, implementing such an idea is non-trivial due to the asymmetric modality problem between the RGB and the depth information. To tackle this, researchers have devoted efforts into two directions: designing dedicated architectures for RGB-D data [6,8,13,15,17,21,28], and presenting novel layers to enhance or replace the convolutional layers in RGB semantic segmentation [5,27,30]. Our method falls into the second category.…”
Section: Related Work (mentioning)
confidence: 99%
“…The difference between appearance information and geometric information is the diverse part of these two modalities. The main focus to achieve multi-modal learning for RGB-D data is through using different fusion strategies [42,4,31,24], e.g. early fusion, late fusion or cross-level fusion.…”
Section: Multi-modal Learning With RGB-D Dataset (mentioning)
confidence: 99%
“…For RGBD input, current Sconv [2] suggests learning the RGB offset from a semantic depth feature map. We share the same motivation as Sconv.…”
Section: Offset Generator (mentioning)
confidence: 99%