Neural volumetric representations have shown that Multi-Layer Perceptrons (MLPs) can be optimized with multi-view calibrated images to represent scene geometry and appearance without explicit 3D supervision. Object segmentation can enrich many downstream applications based on the learned radiance field. However, introducing hand-crafted segmentation to define regions of interest in a complex real-world scene is non-trivial and expensive, as it requires per-view annotation. This paper explores self-supervised learning of object segmentation using NeRF for complex real-world scenes. Our framework, called NeRF with Self-supervised Object Segmentation (NeRF-SOS), couples object segmentation and a neural radiance field to segment objects in any view within a scene. By proposing a novel collaborative contrastive loss at both the appearance and geometry levels, NeRF-SOS encourages NeRF models to distill compact, geometry-aware segmentation clusters from their density fields and from self-supervised pre-trained 2D visual features. The self-supervised object segmentation framework can be applied to various NeRF models, yielding photo-realistic rendering results and convincing segmentation maps for both indoor and outdoor scenarios. Extensive results on the LLFF, BlendedMVS, CO3Dv2, and Tanks & Temples datasets validate the effectiveness of NeRF-SOS. It consistently surpasses other 2D-based self-supervised baselines and predicts finer object masks than existing supervised counterparts. Please refer to the video on our project page for more details: https://zhiwenfan.github.io/NeRF-SOS/.

Recently, neural volumetric rendering techniques have shown great power in scene reconstruction. In particular, the neural radiance field (NeRF) and its variants (Mildenhall et al., 2020a; Barron et al., 2021) adopt multi-layer perceptrons (MLPs) to learn a continuous representation and utilize calibrated multi-view images to render unseen views with fine-grained details. Beyond rendering quality, the ability of scene understanding has been explored by several recent works (Vora et al., 2021; Yang et al., 2021; Zhi et al., 2021). Nevertheless, they either require dense view annotations to train a heavy 3D backbone for capturing semantic representations (Vora et al., 2021; Yang et al., 2021), or necessitate human intervention to provide sparse semantic labels (Zhi et al., 2021). Recent self-supervised object discovery approaches on neural radiance fields (Yu et al., 2021c; Stelzner et al., 2021) try to decompose objects from given scenes on synthetic indoor data. However, a gap still remains before such methods can be applied to complex real-world scenarios.
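To make the collaborative contrastive objective described above more concrete, the sketch below illustrates one plausible correlation-distillation formulation: pixel pairs whose frozen teacher features (2D self-supervised appearance features or geometry features derived from the NeRF density field) are similar are encouraged to share segmentation features, while dissimilar pairs are pushed apart. This is only a minimal illustration under our own assumptions; the function names, the similarity shift, and the loss weights are hypothetical and do not reproduce the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def correlation(feats_a, feats_b):
    """Cosine-similarity correlation between two sets of per-pixel features.
    feats_*: (N, C) tensors."""
    a = F.normalize(feats_a, dim=-1)
    b = F.normalize(feats_b, dim=-1)
    return a @ b.t()  # (N, N) pairwise similarities

def contrastive_correlation_loss(teacher_feats, seg_feats, shift=0.3):
    """Correlation-distillation term: pairs whose teacher similarity exceeds
    `shift` pull their segmentation features together; other pairs push apart.
    teacher_feats: frozen features (e.g. from a self-supervised 2D backbone,
    or geometry features from the radiance field) for sampled pixels.
    seg_feats: output of a small segmentation head for the same pixels."""
    with torch.no_grad():
        teacher_corr = correlation(teacher_feats, teacher_feats)
    seg_corr = correlation(seg_feats, seg_feats)
    # Negative sign: maximize segmentation agreement where the teacher agrees.
    return -((teacher_corr - shift) * seg_corr).mean()

def collaborative_loss(appearance_feats, geometry_feats, seg_feats,
                       w_app=1.0, w_geo=1.0):
    """Hypothetical combination of an appearance-level term (2D features)
    and a geometry-level term (density-derived features)."""
    loss_app = contrastive_correlation_loss(appearance_feats, seg_feats)
    loss_geo = contrastive_correlation_loss(geometry_feats, seg_feats)
    return w_app * loss_app + w_geo * loss_geo
```

In this sketch the segmentation head is trained jointly with (or on top of) the radiance field, so the clusters it produces remain consistent across rendered views; the specific weighting between the appearance and geometry terms is an assumed hyperparameter.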