2022
DOI: 10.48550/arxiv.2210.03043
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

Feature-Realistic Neural Fusion for Real-Time, Open Set Scene Understanding

Abstract: General scene understanding for robotics requires flexible semantic representation, so that novel objects and structures which may not have been known at training time can be identified, segmented and grouped. We present an algorithm which fuses general learned features from a standard pretrained network into a highly efficient 3D geometric neural field representation during real-time SLAM. The fused 3D feature maps inherit the coherence of the neural field's geometry representation. This means that tiny amoun… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1
1

Citation Types

0
6
0

Year Published

2023
2023
2023
2023

Publication Types

Select...
1
1

Relationship

0
2

Authors

Journals

citations
Cited by 2 publications
(6 citation statements)
references
References 14 publications
0
6
0
Order By: Relevance
“…The neural network is parametrized by spatial coordinates and pixel value estimates can be recovered by the integration along projection rays described in [41]. This system and its derivatives [42] use RGB-D inputs in tracking and mapping. Model architecture aside, this approach is otherwise analogous to more traditional ones like [21][22][23]35].…”
Section: Orb-slam (2015)mentioning
confidence: 99%
See 1 more Smart Citation
“…The neural network is parametrized by spatial coordinates and pixel value estimates can be recovered by the integration along projection rays described in [41]. This system and its derivatives [42] use RGB-D inputs in tracking and mapping. Model architecture aside, this approach is otherwise analogous to more traditional ones like [21][22][23]35].…”
Section: Orb-slam (2015)mentioning
confidence: 99%
“…First, given semantic image segmentations (refer to Section 3.3 for more detail), the model can also be trained to output discrete [51] or open-set [52] semantic labels for each point in space. Second, taking advantage of the inherently projective nature of NRFs making recovery of depth information trivial, and the differentiable nature of neural networks, an NRF can be optimized along with the pose estimates and, thus, replace a traditional map in a real-time RGB-D SLAM system [37,42].…”
Section: Implicit Scene Modelsmentioning
confidence: 99%
“…Neural implicit representations infer scene semantics [4], [5], [37]- [42] jointly with geometry using a multi-layer perceptron or similar parametric model. These have been extended to dynamic scenes [43].…”
Section: Semantic Scene Representationsmentioning
confidence: 99%
“…These have been extended to dynamic scenes [43]. Neural feature fields [5], [38], [44], [45] are neural implicit representations which map continuous 3D coordinates to vector-valued features. Such representations have shown remarkable ability at scene segmentation and editing.…”
Section: Semantic Scene Representationsmentioning
confidence: 99%
See 1 more Smart Citation