2021 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra48506.2021.9561801

Diffuser: Multi-View 2D-to-3D Label Diffusion for Semantic Scene Segmentation

Abstract: Semantic 3D scene understanding is a fundamental problem in computer vision and robotics. Despite recent advances in deep learning, its application to multi-domain 3D semantic segmentation typically suffers from the lack of extensive enough annotated 3D datasets. On the contrary, 2D neural networks benefit from existing large amounts of training data and can be applied to a wider variety of environments, sometimes even without need for retraining. In this paper, we present 'Diffuser', a novel and efficient mul…

Cited by 20 publications (5 citation statements)
References 28 publications
“…Other works propose alternatives to the probabilistic Bayesian update for fusing semantic labels from multi-view 2D images into a 3D map. Mascaro et al [34] build a sparse diffusion graph connecting 2D pixels to 3D points and 3D points to their K nearest neighbors to propagate labels from a 2D image segmentation to the 3D model. After graph construction, iterative multiplication of the label matrix with a probabilistic transition matrix yields the diffused semantic labels.…”
Section: Related Work (mentioning)
confidence: 99%
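
The statement above describes label diffusion as repeated multiplication of a label matrix by a probabilistic transition matrix over a graph that links 2D pixels to 3D points and 3D points to their K nearest neighbors. The following is a minimal sketch of that idea, not the authors' implementation. The function names (`knn_indices`, `diffuse_labels`), the uniform edge weights, the absorbing pixel nodes, the fixed iteration count, and the dense matrix (used for brevity where the statement describes a sparse graph) are all assumptions made for illustration.

```python
import numpy as np


def knn_indices(points, k):
    """Brute-force K nearest neighbours of each 3D point (self excluded)."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is never its own neighbour
    return np.argsort(dists, axis=1)[:, :k]


def diffuse_labels(points, pixel_labels, pixel_to_point, num_classes, k=2, iters=50):
    """Propagate 2D pixel labels to 3D points over a small diffusion graph.

    Hypothetical interface: points is (N, 3); pixel_labels[i] is the class of
    pixel i; pixel_to_point[i] is the index of the 3D point that pixel i observes.
    Requires N > k so that every point has k neighbours.
    """
    n_pts, n_pix = len(points), len(pixel_labels)
    n = n_pts + n_pix  # node order: 3D points first, then pixels

    adjacency = np.zeros((n, n))
    # Each 3D point draws labels from the pixels that observe it ...
    for pix, pt in enumerate(pixel_to_point):
        adjacency[pt, n_pts + pix] = 1.0
    # ... and from its K nearest 3D neighbours.
    neighbours = knn_indices(points, k)
    for i in range(n_pts):
        adjacency[i, neighbours[i]] = 1.0
    # Assumption: pixel nodes are absorbing, so their labels stay fixed.
    pix_nodes = n_pts + np.arange(n_pix)
    adjacency[pix_nodes, pix_nodes] = 1.0

    # Row-normalise to obtain a probabilistic (row-stochastic) transition matrix.
    transition = adjacency / adjacency.sum(axis=1, keepdims=True)

    # Label matrix: one-hot rows for pixels, zeros for the unlabeled 3D points.
    labels = np.zeros((n, num_classes))
    labels[pix_nodes, pixel_labels] = 1.0

    # Iterative multiplication of the label matrix with the transition matrix.
    for _ in range(iters):
        labels = transition @ labels

    point_scores = labels[:n_pts]
    return point_scores.argmax(axis=1), point_scores
```

A call such as `diffuse_labels(points, [0, 1], [0, 5], num_classes=2)` returns one predicted class per 3D point together with the per-class scores. For real point clouds, a sparse matrix and a spatial index for the neighbor search would be the natural replacements for the dense array and the brute-force distance computation used here.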
“…It is worthwhile to mention that our method is not limited to a specific neural field method and can be extended easily to faster [19,40,15] and better-quality NeRFs [2,24]. Semantic segmentation in 3D: Semantic segmentation in 3D has been studied using multi-view fusion-based representations [1,22,11,20,39,21,44,52] that require only 2D supervision when training, and a separate 3D mesh at testing time, unlike implicit methods like ours. Recently, there have been promising attempts to recover 3D semantic maps from 2D inputs using NeRFs.…”
Section: Related Work (mentioning)
confidence: 99%
“…Many methods use one data modality to supervise or inform another [1,3,37,38,42,48,65,68,69,76,93,98,104,130,148]. For 3D semantic segmentation, multiview fusion [2,55,73,84,86,89,133,133,145] is a popular family of methods that require only image supervision. However, these methods reason exclusively in the image domain and require an input 3D substrate such as a point cloud or polygonal mesh on which to aggregate 2D information.…”
Section: Related Work (mentioning)
confidence: 99%