2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.01205

SceneCode: Monocular Dense Semantic Reconstruction Using Learned Encoded Scene Representations

Abstract: Systems which incrementally create 3D semantic maps from image sequences must store and update representations of both geometry and semantic entities. However, while there has been much work on the correct formulation for geometrical estimation, state-of-the-art systems usually rely on simple semantic representations which store and update independent label estimates for each surface element (depth pixels, surfels, or voxels). Spatial correlation is discarded, and fused label maps are incoherent and noisy. We i…

Cited by 67 publications (36 citation statements)
References 29 publications
“…A sample reconstruction is shown in Figure 5. With computer vision research evolving rapidly, we believe constructing geometry proxies from video input will become even more robust and easily accessible [3,69].…”
Section: Geometry Reconstruction (mentioning)
confidence: 99%
“…Preliminary work exists towards joint geometric and semantic SLAM (e.g. [5]); yet, these systems are fairly limited in terms of accuracy and scaling. Instead, the majority of state-of-the-art work relies on a sequential geometric reconstruction and frame-wise labelling, followed by semantic fusion.…”
Section: Related Work (mentioning)
confidence: 99%
“…Hu et al [51] finally extend this approach to a complete SLAM framework that optimises a larger scale graph over many frames and multiple complex-shaped objects of different classes (e.g., chairs, tables) as well as latent shape representations for each object instance in parallel. Note that low-dimensional latent representations for modelling 3D geometry have also been utilised by Bloesch et al [52] and Zhi et al [53]. More specifically, they rely on photometric consistency to optimise codes in each keyframe that generate depth maps using a deconvolutional architecture.…”
Section: Review Of Current Slam Systems and Their Evolution Into Spat (mentioning)
confidence: 99%