2017 IEEE International Conference on Robotics and Automation (ICRA) 2017
DOI: 10.1109/icra.2017.7989538

SemanticFusion: Dense 3D semantic mapping with convolutional neural networks

Abstract: Ever more robust, accurate and detailed mapping using visual sensing has proven to be an enabling factor for mobile robots across a wide variety of applications. For the next level of robot intelligence and intuitive user interaction, maps need to extend beyond geometry and appearance: they need to contain semantics. We address this challenge by combining Convolutional Neural Networks (CNNs) and a state-of-the-art dense Simultaneous Localisation and Mapping (SLAM) system, ElasticFusion, which provides long-term d…

Cited by 583 publications (521 citation statements) · References 24 publications

“…We solve this issue by utilizing the dense pixel-wise semantic probability distribution produced by the neural network over every class. Therefore, we can improve the fusion accuracy by projecting the labels with a statistical approach using Bayesian fusion [ASZ*16] [HFL14] [ZSS17] [MHDL17]. Bayesian fusion enables us to update the label prediction $l_i$ on 2D images $I_k$ within the common coordinate frame of the 3D model: $P(x = l_i \mid I_{1,\dots,k}) = \frac{1}{Z}\, P(x = l_i \mid I_{1,\dots,k-1})\, P(x = l_i \mid I_k)$, …”
Section: Methods
confidence: 99%
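To make the quoted update concrete, here is a minimal sketch in Python (not the citing authors' code; `bayesian_label_update` and all variable names are hypothetical): the per-class distribution stored on a map element is multiplied element-wise by the CNN's prediction for the new frame and renormalised by the constant $Z$.

```python
import numpy as np

def bayesian_label_update(surfel_probs, frame_probs):
    """Recursive Bayesian fusion of per-class label probabilities.

    surfel_probs: (C,) running estimate P(x = l_i | I_1..k-1) stored on a
                  map element (e.g. a surfel).
    frame_probs:  (C,) CNN prediction P(x = l_i | I_k) at the pixel the
                  element projects to in the new frame.
    Returns the renormalised posterior P(x = l_i | I_1..k).
    """
    fused = surfel_probs * frame_probs   # element-wise product over classes
    z = fused.sum()                      # normalisation constant Z
    if z == 0.0:                         # guard against degenerate inputs
        return surfel_probs
    return fused / z

# Hypothetical 3-class example: the dominant class sharpens after fusion.
prior = np.array([0.5, 0.3, 0.2])        # running estimate on the element
cnn   = np.array([0.7, 0.2, 0.1])        # new CNN prediction for frame I_k
posterior = bayesian_label_update(prior, cnn)  # ~[0.814, 0.140, 0.047]
```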
“…Combined with SLAM systems, 2D semantic segmentation can be extended to 3D environments [RA17] [TTLN17] [ZSS17] [MHDL17], a promising direction for robotic scene understanding and autonomous driving. Unlike these existing methods, which aim to provide robots with a semantic understanding of the scene, we focus our attention on human interactions.…”
Section: Previous Work
confidence: 99%
“…There has been great interest from the computer vision and robotics communities in exploiting object-level information, since many applications benefit from the awareness that object instances can provide: assistive computer vision [7,8,9], tracking/SLAM [10,11], or place categorization/scene recognition and life-long mapping [12,13].…”
Section: Related Work
confidence: 99%
“…For instance, Nascimento et al. [30] applied a binary RGB-D descriptor to feed an AdaBoost learning method that classifies objects in a navigation task. McCormac et al. [11] proposed a method for semantic 3D mapping. Their work combined the formulation of Whelan et al. [31], an RGB-D SLAM system for building a dense point cloud of the scene, with an encoder-decoder convolutional network for pixel-wise semantic segmentation.…”
Section: SLAM and Augmented Semantic Representations
confidence: 99%
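The pipeline quoted above hinges on associating 2D CNN predictions with 3D map geometry. Below is a minimal sketch of that association under a standard pinhole camera model; it is not the implementation of [11] or [31], and all function and variable names are assumptions. Each map point is projected into the current frame using the SLAM pose estimate, and the CNN's prediction at that pixel can then be fused into the point's label distribution (e.g. with `bayesian_label_update` above).

```python
import numpy as np

def project_point(K, T_wc, p_world):
    """Project a 3D map point into the current camera frame (pinhole model).

    K:       (3, 3) camera intrinsics matrix.
    T_wc:    (4, 4) world-to-camera rigid transform from the SLAM tracker.
    p_world: (3,) point position taken from the dense map.
    Returns pixel coordinates (u, v), or None if the point lies behind the camera.
    """
    p_cam = (T_wc @ np.append(p_world, 1.0))[:3]   # world -> camera coordinates
    if p_cam[2] <= 0.0:
        return None
    uvw = K @ p_cam                                 # perspective projection
    return int(round(uvw[0] / uvw[2])), int(round(uvw[1] / uvw[2]))

def label_for_point(K, T_wc, p_world, cnn_probs):
    """Look up the CNN's per-class probabilities for one map point.

    cnn_probs: (H, W, C) pixel-wise class distribution for the current frame.
    Returns the (C,) distribution at the projected pixel, or None if off-frame.
    """
    uv = project_point(K, T_wc, p_world)
    if uv is None:
        return None
    u, v = uv
    h, w = cnn_probs.shape[:2]
    if not (0 <= u < w and 0 <= v < h):
        return None
    return cnn_probs[v, u]                          # row index = v, column = u
```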
“…The Dense Planar SLAM system [4] seamlessly maps an environment using planar and non-planar regions while tracking the sensor pose in real time. SemanticFusion [5] combines Convolutional Neural Networks (CNNs) with a state-of-the-art dense SLAM system, ElasticFusion, to create a dense semantic map. All four semantic SLAM systems mentioned above focus on dense maps, although sparse-map semantic SLAM is in fact sufficient for some applications.…”
Section: Introduction
confidence: 99%